covega / enviro_papers

Take datasets on the environment and slot them into candidate-specific research papers
MIT License

enviro_papers

Take datasets on the environment and slot them into candidate-specific research papers

Setup

We're using Python 3.6 and Jupyter Notebook.

  1. Install Jupyter Notebook

  2. Install Virtualenv

    pip install virtualenv

Development

# Enter virtual environment and install dependencies
virtualenv venv
source venv/bin/activate
pip install -r requirements.txt

# Create SQL database
python run.py

# Open Notebook
jupyter notebook

Directory Structure

├── papers.ipynb        # Python Notebook for generating Word docs
├── README.md           # This README
├── app/ 
│   ├── __init__.py     # Runs all app functions
│   ├── config.py       # Configuration values
│   ├── create_tables/  # Scripts that import data from 
│   ├── models/         # SQL table definitions
│   ├── queries/        # Queries that create district-level data
│   ├── templates/      # Word templates
│   └── util.py         # Useful functions
├── data 
│   ├── cleaned         # Data intended for import into SQL
│   └── raw             # Raw data
├── papers.db           # Database file
├── requirements.txt    # Python packages
├── run.py              # Python script that runs the app
├── scripts             # Scripts run to clean data
└── venv                # Virtual environment files
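
To give a feel for what an import script under app/create_tables/ might do, here is a minimal sketch that loads a cleaned CSV into a SQLite table. It is not the project's actual code: the function names, the table name `asthma`, the column names, and the sample values are all hypothetical, and we are assuming papers.db is a SQLite file.

```python
import csv
import io
import sqlite3


def quote_ident(name):
    """Quote an SQL identifier so names with spaces or quotes are safe."""
    return '"' + name.replace('"', '""') + '"'


def load_csv_into_table(conn, table, csv_file):
    """Create `table` from the CSV header, insert every row, return the row count."""
    reader = csv.reader(csv_file)
    header = next(reader)
    # Every column is stored as TEXT to keep the sketch simple.
    columns = ", ".join(quote_ident(h) + " TEXT" for h in header)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS {} ({})".format(quote_ident(table), columns)
    )
    placeholders = ", ".join("?" for _ in header)
    rows = [row for row in reader if row]  # skip blank lines
    conn.executemany(
        "INSERT INTO {} VALUES ({})".format(quote_ident(table), placeholders), rows
    )
    conn.commit()
    return len(rows)


# Hypothetical sample standing in for a file under data/cleaned/ (made-up values).
sample = io.StringIO("county,state,adult_asthma\nAlameda,CA,100\nKing,WA,200\n")
conn = sqlite3.connect(":memory:")  # in-memory here; the project writes papers.db
inserted = load_csv_into_table(conn, "asthma", sample)
```

Swapping `":memory:"` for a file path would write the table to disk instead. Values are bound with `?` placeholders rather than string formatting, so the cell contents never become part of the SQL text.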

Data

Instructions on how to find and reproduce our data inputs

ALA asthma data

We're capturing the number of children and adults who have asthma in each county. The American Lung Association (ALA) publishes this data on its website, and we're pulling it from the underlying API.

Polling data

We're using Yale's Climate Opinion map data from 2019 to see what people think about the environment and climate change.

Voting data

We're transforming state-level scorecard PDFs, which list every legislator's votes for and against climate issues, into machine-readable Excel files.
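
As a rough illustration of the "machine readable" step, the sketch below parses text that has already been extracted from a scorecard and tallies each legislator's votes. Everything about the layout is an assumption: real scorecards differ by state, the line shape (name, district number, then `+`/`-` marks) is invented for this example, and the sample legislators are made up. It writes CSV rather than .xlsx to stay within the standard library.

```python
import csv
import io
import re

# Assumed line shape after PDF text extraction: "<name> <district> <marks>",
# where each mark is "+" (vote for) or "-" (vote against). Hypothetical format.
LINE = re.compile(
    r"^(?P<name>.+?)\s+(?P<district>\d+)\s+(?P<marks>[+\-](?:\s+[+\-])*)$"
)


def parse_scorecard(text):
    """Return one dict per legislator line; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # headers, footers, blank lines
        marks = match.group("marks").split()
        rows.append({
            "legislator": match.group("name"),
            "district": int(match.group("district")),
            "votes_for": marks.count("+"),
            "votes_against": marks.count("-"),
        })
    return rows


# Made-up sample text, not taken from any real scorecard.
sample = """Legislator District Votes
Jane Doe 12 + - + +
John Roe 7 - - +
"""
rows = parse_scorecard(sample)

# Write the rows as CSV, which Excel can open directly.
buffer = io.StringIO()
fields = ["legislator", "district", "votes_for", "votes_against"]
writer = csv.DictWriter(buffer, fieldnames=fields)
writer.writeheader()
writer.writerows(rows)
```

Skipping unmatched lines silently is convenient for a sketch, but a real cleaning script would probably want to log them so that malformed rows are not lost.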

Clean energy jobs data

In development...

Daily Kos data

Of the data sets provided by Daily Kos, we currently use only the Counties ↔ congressional district correspondences and the Counties ↔ legislative district correspondences. The script assumes that you open the Google Sheet and choose File > Download > Microsoft Excel (.xlsx). After placing the downloaded file in data/daily-kos/, run the following script:

# Enter virtual environment and install dependencies
virtualenv venv
source venv/bin/activate
pip install -r requirements.txt

# Run the Script
python scripts/clean_daily_kos.py
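
To show why the county ↔ district correspondences matter, here is a minimal sketch of rolling county-level counts up to districts. It is not what scripts/clean_daily_kos.py does: the function name and all numbers are hypothetical, and it assumes each correspondence row can be reduced to the share of a county that falls in a district.

```python
from collections import defaultdict


def rollup_to_districts(county_values, correspondences):
    """Apportion county-level counts to districts.

    county_values: {county: count}
    correspondences: iterable of (county, district, share), where share is the
    fraction of the county assigned to that district (assumed to sum to 1
    across each county's rows).
    """
    totals = defaultdict(float)
    for county, district, share in correspondences:
        if county in county_values:
            totals[district] += county_values[county] * share
    return dict(totals)


# Made-up numbers, for illustration only.
county_asthma = {"County A": 1000, "County B": 400}
links = [
    ("County A", "District 1", 0.75),
    ("County A", "District 2", 0.25),
    ("County B", "District 2", 1.0),
]
district_asthma = rollup_to_districts(county_asthma, links)
```

With these inputs, District 1 receives three quarters of County A's count, and District 2 receives the remaining quarter plus all of County B's. Apportioning by share assumes cases are spread evenly within a county, which is a simplification.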