A tool for digitizing election results data in the form of handwritten digits.
The instructions below should get you set up with a development environment. To run the app in production, follow the instructions in DEPLOYMENT.md.
Install OS level dependencies:
Clone this repo & install app requirements
We recommend using virtualenv and virtualenvwrapper for working in a virtualized development environment. Read how to set up virtualenv.
Once you have virtualenvwrapper set up, run:
mkvirtualenv et
git clone git@github.com:datamade/election-transcriber.git
cd election-transcriber
pip install -r requirements.txt
Create a PostgreSQL database for election transcriber
If you aren't already running PostgreSQL, we recommend installing version 9.6 or later.
createdb election_transcriber
Create your own app_config.py
cp transcriber/app_config.py.example transcriber/app_config.py
You will need to change, at minimum:
the database username setting and DB_PW, to reflect your PostgreSQL username/password (by default, the username is your computer name and the password is '')
S3_BUCKET, to tell the application where to look for your cache of images
the setting that tells the application where to find the CSV file with your AWS credentials in it. By default, the application looks for a file called credentials.csv in the root folder of the project.
You can also change the username, email and password for the initial user roles, defined by ADMIN_USER
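As a rough illustration of the AWS credentials setting described above, the sketch below reads an access key pair from a CSV file using only the standard library. The function name `load_aws_credentials` and the column headers `Access key ID` and `Secret access key` are assumptions (check the header row of your own file), and this is not necessarily how the application itself reads the file.

```python
import csv


def load_aws_credentials(path):
    """Return (access_key_id, secret_access_key) from the first data row of a CSV file."""
    # utf-8-sig tolerates a byte-order mark at the start of the file.
    with open(path, newline="", encoding="utf-8-sig") as f:
        row = next(csv.DictReader(f), None)
    if row is None:
        raise ValueError(f"{path} contains no credentials row")
    # Assumption: header capitalization may vary, so compare case-insensitively.
    normalized = {
        key.strip().lower(): (value or "").strip()
        for key, value in row.items()
        if key
    }
    try:
        return normalized["access key id"], normalized["secret access key"]
    except KeyError as missing:
        raise ValueError(f"{path} is missing the column {missing}") from None
```

Running a check like this before starting the app can catch a misplaced or misnamed credentials file early.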
Create your own alembic.ini
cp alembic.ini.example alembic.ini
You will need to change, at minimum, user & pass (to reflect your PostgreSQL username/password) on line 6.
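Alembic reads its database connection string from the sqlalchemy.url setting in alembic.ini. Assuming that is the setting the instruction above refers to, the sketch below shows one way to build such a URL with the credentials percent-encoded. The helper name `postgres_url` and its default host are hypothetical.

```python
from urllib.parse import quote


def postgres_url(user, password="", host="localhost", database="election_transcriber"):
    """Build a PostgreSQL connection URL, percent-encoding the credentials."""
    credentials = quote(user, safe="")
    if password:
        # Only add the ":password" part when a password is actually set.
        credentials += ":" + quote(password, safe="")
    return f"postgresql://{credentials}@{host}/{database}"


print(postgres_url("myuser"))
print(postgres_url("myuser", "p@ss/word"))
```

If the encoded value contains a % sign, it most likely needs to be doubled (%%) when pasted into alembic.ini, since the file is parsed with value interpolation.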
Initialize the database
alembic upgrade head
Import images
python update_images.py
Run the app
python runserver.py
In another terminal, run the worker
python run_queue.py
Once the server is running, navigate to http://localhost:5000/
There is a script in the root folder of the project called syncDriveFolder.py. As you might guess, it is responsible for syncing files from a Google Drive folder to an AWS S3 bucket.
Set up Google Service Account
{
  "type": "service_account",
  "project_id": "[name of the project]",
  "private_key_id": "[long hash]",
  "private_key": "[very very long hash]",
  "client_email": "some-user@project-name.iam.gserviceaccount.com",
  "client_id": "[long number]",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://accounts.google.com/o/oauth2/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "[long URL]"
}
As explained in the step where you downloaded it, the contents of this file should be kept secret.
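A quick way to sanity-check the downloaded file is to load it and confirm it has the keys shown in the example above. The sketch below does that with the standard library and returns the client_email value; the function name `read_service_account` is hypothetical and is not part of the project.

```python
import json

# Keys that appear in the example service-account file above.
EXPECTED_KEYS = {
    "type", "project_id", "private_key_id", "private_key",
    "client_email", "client_id", "auth_uri", "token_uri",
    "auth_provider_x509_cert_url", "client_x509_cert_url",
}


def read_service_account(path):
    """Load a service-account JSON file and return its client_email."""
    with open(path, encoding="utf-8") as f:
        info = json.load(f)
    missing = EXPECTED_KEYS - info.keys()
    if missing:
        raise ValueError(f"{path} is missing keys: {sorted(missing)}")
    if info["type"] != "service_account":
        raise ValueError(f"{path} is not a service-account credentials file")
    # Deliberately return only the email, so the private key is never printed.
    return info["client_email"]
```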
address from that JSON file.
Set up AWS User
"Version": "2012-10-17",
"Statement": [
"Sid": "Stmt1508430268000",
"Effect": "Allow",
"Action": [
"Resource": [
"Effect": "Allow",
"Action": [
"Resource": [
To run the syncDriveFolder.py script, put the credentials file from Google and the credentials file from AWS in the root folder of the project, then run the script like this:
python syncDriveFolder.py -f [name_of_drive_folder] -n [name_of_election]
A full list of options for that script can be seen by running python syncDriveFolder.py --help
usage: syncDriveFolder.py [-h] [--aws-creds AWS_CREDS]
[--google-creds GOOGLE_CREDS] -n ELECTION_NAME -f
DRIVE_FOLDER [--capture-hierarchy]
Sync and convert images from a Google Drive Folder to an S3 Bucket
optional arguments:
-h, --help show this help message and exit
--aws-creds AWS_CREDS
Path to AWS credentials. (default:
--google-creds GOOGLE_CREDS
Path to Google credentials. (default:
Short name to be used under the hood for the election
(default: None)
-f DRIVE_FOLDER, --drive-folder DRIVE_FOLDER
Name of the Google Drive folder to sync (default:
--capture-hierarchy Capture a geographical hierarchy from the name of the
file. (default: False)
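For readers who want to see how a command-line interface like the one above fits together, here is a minimal argparse sketch that produces a similar help screen. It is an approximation, not the script's actual source: the long option name `--election-name` and both credential-path defaults are assumptions, since they are not shown in full in the help output above.

```python
import argparse


def build_parser():
    """Build a parser resembling the syncDriveFolder.py help output."""
    parser = argparse.ArgumentParser(
        prog="syncDriveFolder.py",
        description="Sync and convert images from a Google Drive Folder to an S3 Bucket",
        # This formatter appends "(default: ...)" to each help string.
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    # Hypothetical defaults: the real ones are not shown in full above.
    parser.add_argument("--aws-creds", default="path/to/aws-creds.csv",
                        help="Path to AWS credentials.")
    parser.add_argument("--google-creds", default="path/to/google-creds.json",
                        help="Path to Google credentials.")
    # Assumption: "--election-name" is a guess at the long form of -n.
    parser.add_argument("-n", "--election-name", required=True,
                        help="Short name to be used under the hood for the election")
    parser.add_argument("-f", "--drive-folder", required=True,
                        help="Name of the Google Drive folder to sync")
    parser.add_argument("--capture-hierarchy", action="store_true",
                        help="Capture a geographical hierarchy from the name of the file.")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["-f", "My Drive Folder", "-n", "my_election"])
print(args.drive_folder, args.election_name, args.capture_hierarchy)
```

Marking -n and -f as required matches the usage line, where they are the only options not wrapped in square brackets.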