csjohns / pb-voter-turnout

Analysis of effects from participatory budgeting on voter turnout in New York City

PBNYC Voter Turnout

This repository contains the code for the article "Participatory Budgeting and Voter Turnout", published in the journal Political Behavior.

All scripts required to replicate the analysis are in the main project directory. The data collection and cleaning scripts are in the R_data_collection directory, and the code that scrapes the NYC BOE website to generate competitiveness measures is in the R_competitiveness directory.

To replicate the analysis, clone or fork this repository. The de-identified individual-level data can be obtained upon request by emailing info@nycet.org with the subject line "Requesting replication data for PB voter turnout study". The voterfile_full_clean_deid.rds file provided should be saved into the data/cleaned_R_results directory.
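As a quick sanity check before running anything, the sketch below tests whether the data file sits where the scripts expect it. It assumes the R working directory is the repository root; `check_replication_data` is a hypothetical helper written for this README and is not part of the repository.

```r
# Expected location of the replication data, relative to the repository root
# (path taken from the paragraph above).
data_path <- file.path("data", "cleaned_R_results", "voterfile_full_clean_deid.rds")

# Hypothetical helper (not part of the repository): returns TRUE if the file
# is in place, otherwise issues a warning and returns FALSE.
check_replication_data <- function(path = data_path) {
  if (!file.exists(path)) {
    warning("Replication data not found at ", path, call. = FALSE)
    return(FALSE)
  }
  TRUE
}
```

Calling `check_replication_data()` from the repository root should return `TRUE` once the file has been saved to the right directory.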

To replicate the analysis, start at step 1 in the table below. The table lists every script, with its script and data dependencies, in the order they must be run; the step 0 scripts document how the dataset was built and are not needed for replication.

| Order | Script | Summary | Scripts called | Data in | Data out | Fig/table out |
| --- | --- | --- | --- | --- | --- | --- |
| 0- | R_data_collection/rr_vf_processing_full.R | Creates the full voterfile with all auxiliary information except ED turnout. This file and its dependencies are not needed to run the replication; they are included for transparency about how the dataset was created. | vf_gis_nyccdmatch.R, pb_cleanup_addnyccd_foranalysis.R, rr_vf_aux_processing.R, (R_data_collection/rr_working_district_match.R for district_match_res.rds) | personfile, pb data, pb_district_votes.csv, district_match_res.rds | cleaned_R_results/voterfile_full_clean_deid.rds | |
| 0a | R_data_collection/vf_gis_nyccdmatch.R | | | shapefiles/nycc_18c/nycc.shp, ed-nyccd-map.csv | | |
| 0b | R_data_collection/pb_cleanup_addnyccd_foranalysis.R | | | ed-nyccd-map.csv, pbnyc_district_votes.csv | | |
| 0c | R_data_collection/rr_vf_aux_processing.R | | R_data_collection/censustables.R, BOE_pres_process.R | voters_census.rds, wide_compet_clean_deid.rds, council_districts.rds | | |
| 0c-1 | R_data_collection/censustables.R | Downloads ACS data for NYC census tracts. | | | census.Rdata | |
| 0c-2 | R_competitiveness/BOE_pres_process.R | Attaches presidential election margins to city/district election data. | | data/pres_elec_res.csv | | |
| 0d | R_data_collection/rr_working_district_match.R | Matches districts by district-level characteristics to identify most-similar match groups. | | voterfile_for_matching.rds, council_districts_wmargin.rds, pbnyc_district_votes | district_match_res.rds | |
| 1- | rr_vf_processing_for_matching.R | *Start Here* Splits the full voterfile into comparison sets (suffix = "" (main), suffix = "_placebo", suffix = "within_dist"). | | voterfile_full_clean_deid.rds, pbdistricts.rds, pbnyc_district_votes.csv | voter_file_for_matching_SUFFIX.rds | |
| 2- | rr_vf_matching_iterate | Executes iterated matching with different match specifications. Must be rerun for each major comparison framework (suffix = "" (main), suffix = "within_dist"). | rr_exact_match.R, rr_matching_levels_SUFFIX.R, rr_matching_functions.R | voterfile_for_matching_SUFFIX.rds, matchablevans_SUFFIX.rds (created by rr_exact_match.R) | matching_res_SUFFIX.rds | |
| 2a | rr_exact_match.R | Runs an exact match to find the possible subset of matchable IDs, which speeds up the coarsened match. | | | matchablevans_SUFFIX.rds | |
| 2b | rr_matching_levels_SUFFIX.R | Sets match levels and loads cutpoints for each match. | rr_matching_cutpoints.R creates the cutpoints file used here | cutpoints.Rdata | | |
| 2b-1 | rr_matching_cutpoints.R | Calculates population-based cutoff values for CEM matching. | | voterfile_full_clean_deid.rds | cutpoints.Rdata | |
| 3- | rr_vf_regression_iterate.R | Loads data, attaches ED-level competitiveness measures, iterates over regression models, and saves results. Rerun for each comparison framework (suffix = "" (main), suffix = "within_dist"). | create_pb_long.R, rr_regression_functions.R | matching_res_SUFFIX.RDS, voterfile_for_matching_SUFFIX.rds | iter_regress_check_SUFFIX.rds (full models), iter_regress_lmers_SUFFIX.rds (just summaries) | |
| 3-2 | rr_vf_regression_iterate_placebo.R | Iterated regressions for the placebo model. The data structure, functions, and model differ slightly because there is no variable start to PB (district-wide). Note that this is VERY memory-hungry: I had to run it on a machine with 32 GB of memory. | create_pb_long_placebo.R, rr_regression_functions.R | matching_res_placebo.RDS, voterfile_for_matching_placebo.rds | iter_regress_check_placebo.rds, iter_regress_lmers_placebo.rds | |
| 3-2a | create_pb_long_placebo.R | Reshapes the voterfile long for the placebo-model regression. | | | | |
| 3a | create_pb_long.R | Main function to reshape the file long for regression. | | | | |
| 3b | rr_regression_functions.R | General functions to preprocess data and run regressions. | | | | |
| 4 | rr_vf_regression_compare_method.R | Loads all iterated models and creates the comparison figure for the paper. | | iter_regress_lmers_within_dist.rds, iter_regress_lmers_placebo.rds, iter_regress_lmers.rds | | Paper_text/Figs/robust_compare.pdf |
| 5 | rr_vf_regression_final.R | Loads data, attaches ED-level competitiveness measures, and runs the regression models presented in the paper. | create_pb_long.R, rr_regression_functions.R | matching_res.RDS, voterfile_for_matching.rds | main_effects.rds | mainregs_raw.tex |
| 6 | rr_vf_regression_preds.R | Reruns the main regression AND subgroup interactions in full dummy format for prediction and visualization, for use with Chris Adolph's simcf & tile packages (http://faculty.washington.edu/cadolph/?page=60 still so useful for model inference!). | create_pb_long.R, rr_regression_functions.R | matching_res.RDS, voterfile_for_matching.rds | data/temp/subgroup_res_tractfine.rds | group_fds_bothyears.pdf (and many other intermediate figs), subgroups_SG.tex |
| 7 | descriptives.R | Creates general descriptive statistics. | | voterfile_full_clean_deid.rds, pbdistricts.rds | | turnout.pdf, districtvotes.pdf, age2.pdf, race2.pdf, hhinc2.pdf, college2.pdf |
| 8 | Paper_text/online_appendix.Rmd | Markdown file for robustness checks/online appendix. | rr_matching_balance_functions.R | voterfile_for_matching.rds, matching_res.RDS, iter_regress_lmers.rds, iter_regress_lmers_within_dist.rds, iter_regress_lmers_placebo.rds | iter_regress_lmers.rds | online_appendix.html |
| 8a | rr_matching_balance_functions | Helper functions for match balance comparison; label assignments. | | | | |
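The run order in the table can be sketched as a small driver script. This is only an illustration under several assumptions: the working directory is the repository root, the step 2 script carries an `.R` extension (the table lists it without one), and each script can be run with `source()`. The names `replication_scripts` and `run_replication` are hypothetical and do not exist in the repository. The sketch does not handle the reruns that steps 2 and 3 need for each comparison suffix, because the table does not say how the suffix is selected inside those scripts.

```r
# Top-level scripts for steps 1-7, in the order given in the table.
replication_scripts <- c(
  "rr_vf_processing_for_matching.R",     # step 1: split voterfile into comparison sets
  "rr_vf_matching_iterate.R",            # step 2: iterated matching (".R" extension assumed)
  "rr_vf_regression_iterate.R",          # step 3: iterated regressions
  "rr_vf_regression_iterate_placebo.R",  # step 3-2: placebo regressions (memory-hungry)
  "rr_vf_regression_compare_method.R",   # step 4: robustness comparison figure
  "rr_vf_regression_final.R",            # step 5: regression models presented in the paper
  "rr_vf_regression_preds.R",            # step 6: predictions and subgroup figures
  "descriptives.R"                       # step 7: descriptive statistics
)

# Hypothetical driver: with dry_run = TRUE it only reports which scripts are
# missing; with dry_run = FALSE it sources each script in order.
run_replication <- function(scripts = replication_scripts, dry_run = TRUE) {
  not_found <- scripts[!file.exists(scripts)]
  if (dry_run) {
    message(length(scripts) - length(not_found), " of ", length(scripts),
            " scripts found in ", getwd())
    return(invisible(not_found))
  }
  if (length(not_found) > 0) {
    stop("Missing scripts: ", paste(not_found, collapse = ", "))
  }
  for (script in scripts) {
    message("Running ", script)
    source(script)
  }
  invisible(character(0))
}
```

A dry run (`run_replication()`) is a cheap way to confirm the working directory before committing to the long-running steps. Step 8 is an R Markdown file rather than a script, so it would be rendered separately, for example with `rmarkdown::render("Paper_text/online_appendix.Rmd")`.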