Small archiving project using twarc & MongoDB to retain Twitter metadata & download native (not linked) images.
usage: epheme.py [-h] [-i IMAGES] [-d DATABASE] text
positional arguments:
text string to use in the Twitter search API
optional arguments:
-h, --help show this help message and exit
-i IMAGES, --images IMAGES
directory to place downloaded images in
-d DATABASE, --database DATABASE
name of database to store JSON in
Images directory defaults to "img". Database defaults to the search text with spaces replaced by underscores (e.g. "#search_term").
The functionality built on top of twarc is modest but I'm specifically interested in grabbing images from Twitter. Setting epheme to run regularly (e.g. with cron) lets me continually archive search results, related images, & use mongo's API to display them. There's a small example included here.
Requires virtualenv & mongodb. On a Mac with homebrew, you can get these with sudo pip install virtualenv; brew install mongodb
.
# set up virtual env in the project root, activate, install dependencies
virtualenv .
source bin/activate
pip install -r requirements.txt
# set OAUTH keys for Twitter in shell ENV
export CONSUMER_KEY abcdefgh12345678abcedefgh
# same for CONSUMER_SECRET, ACCESS_TOKEN, ACCESS_TOKEN_SECRET…
# start mongo, run daemon in the background
# the included config file isn't necessary, just a convenience
mongod --config mongod.conf &
CC0 Public Domain