PBNYC Voter Turnout

This repository contains the code for the article "Participatory Budgeting and Voter Turnout" in the journal Political Behavior.

All scripts required to replicate the analysis are in the main project directory. The data collection and cleaning scripts are in the R_data_collection directory, and the code that scrapes the NYC BOE website to generate competitiveness measures is in the R_competitiveness directory.

To replicate the analysis, clone or fork this repository. The de-identified individual-level data are available on request: email info@nycet.org with the subject line "Requesting replication data for PB voter turnout study". Save the voterfile_full_clean_deid.rds file you receive in the data/cleaned_R_results directory.
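
As a rough sketch of the file-placement step, the helper below copies the data file you receive into the expected directory. The function name `place_data` and both paths in the example call are hypothetical; only the target directory and file name come from the instructions above.

```shell
#!/bin/sh
# Sketch only: place_data and the example paths are illustrative assumptions.
# place_data SRC REPO copies the de-identified voterfile into
# REPO/data/cleaned_R_results under the file name the scripts expect.
place_data() {
    src="$1"
    repo="$2"
    dest="$repo/data/cleaned_R_results"
    if [ ! -f "$src" ]; then
        echo "missing input file: $src" >&2
        return 1
    fi
    # Create the directory if the clone does not already contain it.
    mkdir -p "$dest" && cp "$src" "$dest/voterfile_full_clean_deid.rds"
}

# Example call (commented out so the snippet has no side effects):
# place_data "$HOME/Downloads/voterfile_full_clean_deid.rds" "$HOME/pb-voter-turnout"
```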

Start at step 1 in the table below. The table lists every script in the order it must be run, along with the scripts it calls and the data it reads and writes. Steps numbered 0 document how the dataset was created and are not needed for replication.

Order	Script	Summary	Scripts called	Data in	Data out	Fig/table out
0-	R_data_collection/rr_vf_processing_full.R	Creates the full voterfile with all auxiliary information except ED turnout. This file and its dependencies are not needed to run the replication; they are included for transparency about how the dataset was created.	vf_gis_nyccdmatch.R pb_cleanup_addnyccd_foranalysis.R rr_vf_aux_processing.R (R_data_collection/rr_working_district_match.R for district_match_res.rds)	personfile, pb data, pb_district_votes.csv, district_match_res.rds	cleaned_R_results/voterfile_full_clean_deid.rds
0a	R_data_collection/vf_gis_nyccdmatch.R			shapefiles/nycc_18c/nycc.shp, ed-nyccd-map.csv
0b	R_data_collection/pb_cleanup_addnyccd_foranalysis.R		ed-nyccd-map.csv, pbnyc_district_votes.csv
0c	R_data_collection/rr_vf_aux_processing.R		R_data_collection/censustables.R BOE_pres_process.R	voters_census.rds, wide_compet_clean_deid.rds, council_districts.rds
0c-1	R_data_collection/censustables.R	Downloads ACS data for NYC census tracts			census.Rdata
0c-2	R_competitiveness/BOE_pres_process.R	Attaches presidential election margins to city/district election data		data/pres_elec_res.csv
0d	R_data_collection/rr_working_district_match.R	Matches districts on district-level characteristics to identify the most similar match groups		voterfile_for_matching.rds, council_districts_wmargin.rds, pbnyc_district_votes	district_match_res.rds
1-	rr_vf_processing_for_matching.R	Start here. Splits the full voterfile into comparison sets (suffix = "" (main), suffix = "_placebo", suffix = "within_dist")		voterfile_full_clean_deid.rds, pbdistricts.rds, pbnyc_district_votes.csv	voter_file_for_matching_SUFFIX.rds
2-	rr_vf_matching_iterate.R	Executes iterated matching with different match specifications. Must be rerun for each major comparison framework (suffix = "" (main), suffix = "within_dist")	rr_exact_match.R, rr_matching_levels_SUFFIX.R rr_matching_functions.R	voterfile_for_matching_SUFFIX.rds, matchablevans_SUFFIX.rds (created by rr_exact_match.R)	matching_res_SUFFIX.rds
2a	rr_exact_match.R	File to run exact match to find possible subset of matchable IDs to speed up coarsened match			matchablevans_SUFFIX.rds
2b	rr_matching_levels_SUFFIX.R	Setting match levels and loading cutpoints for each match	rr_matching_cutpoints.R creates the cutpoints file used here	cutpoints.Rdata
2b-1	rr_matching_cutpoints.R	Calculating population based cutoff values for CEM matching		voterfile_full_clean_deid.rds	cutpoints.Rdata
3-	rr_vf_regression_iterate.R	Loads data, attaches ED-level competitiveness measures, iterates over regression models, and saves results. Rerun for each comparison framework (suffix = "" (main), suffix = "within_dist")	create_pb_long.R, rr_regression_functions.R	matching_res_SUFFIX.RDS, voterfile_for_matching_SUFFIX.rds	iter_regress_check_SUFFIX.rds (full models), iter_regress_lmers_SUFFIX.rds (just summaries)
3-2	rr_vf_regression_iterate_placebo.R	Iterated regressions for the placebo model. The data structure, functions, and model differ slightly because there is no variable start to PB (district-wide). Note: this is VERY memory-hungry; I had to run it on a machine with 32 GB of memory.	create_pb_long_placebo.R, rr_regression_functions.R	matching_res_placebo.RDS, voterfile_for_matching_placebo.rds	iter_regress_check_placebo.rds, iter_regress_lmers_placebo.rds
3-2a	create_pb_long_placebo.R	reshape voterfile long for regression for placebo model
3a	create_pb_long.R	main function to reshape file long for regression
3b	rr_regression_functions.R	general functions to preprocess data and run regressions
4	rr_vf_regression_compare_method.R	load all iterated models and create comparison fig for paper		iter_regress_lmers_within_dist.rds, iter_regress_lmers_placebo.rds, iter_regress_lmers.rds		Paper_text/Figs/robust_compare.pdf
5	rr_vf_regression_final.R	Load data, attach ED-level competitiveness measures, and run regression models presented in paper.	create_pb_long.R, rr_regression_functions.R	matching_res.RDS, voterfile_for_matching.rds	main_effects.rds	mainregs_raw.tex
6	rr_vf_regression_preds.R	Reruns the main regression AND subgroup interactions in full dummy format for prediction and visualization with Chris Adolph's simcf & tile packages (http://faculty.washington.edu/cadolph/?page=60, still so useful for model inference!)	create_pb_long.R, rr_regression_functions.R	matching_res.RDS, voterfile_for_matching.rds	data/temp/subgroup_res_tractfine.rds	group_fds_bothyears.pdf (and many other intermediate figs), subgroups_SG.tex
7	descriptives.R	create general descriptive statistics		voterfile_full_clean_deid.rds, pbdistricts.rds		turnout.pdf, districtvotes.pdf, age2.pdf, race2.pdf, hhinc2.pdf, college2.pdf
8	Paper_text/online_appendix.Rmd	Markdown file for robustness checks/online appendix	rr_matching_balance_functions.R	voterfile_for_matching.rds, matching_res.RDS, iter_regress_lmers.rds, iter_regress_lmers_within_dist.rds, iter_regress_lmers_placebo.rds		online_appendix.html
8a	rr_matching_balance_functions.R	Helper functions for match balance comparison; label assignments
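
The run order in the table can be sketched as a small shell loop over the scripts for steps 1 through 7. This is an illustration under several assumptions, not the authors' workflow: it assumes `Rscript` is on your PATH, that each script runs non-interactively from the main project directory, that sub-steps (2a, 3a, and so on) are invoked by their parent scripts as the "Scripts called" column suggests, and that the step 2 script carries a `.R` extension. It runs each script once; switching the comparison-framework suffix for steps 2 and 3 would still be done by hand. Step 8 is an R Markdown file and is rendered separately.

```shell
#!/bin/sh
# Sketch only: run_steps and STEPS are hypothetical names, not part of the repo.
# Top-level scripts for steps 1-7, in table order (3 and 3-2 both included).
STEPS="rr_vf_processing_for_matching.R
rr_vf_matching_iterate.R
rr_vf_regression_iterate.R
rr_vf_regression_iterate_placebo.R
rr_vf_regression_compare_method.R
rr_vf_regression_final.R
rr_vf_regression_preds.R
descriptives.R"

# run_steps [RUNNER] runs each script with RUNNER (default: Rscript)
# and stops at the first script that fails.
run_steps() {
    runner="${1:-Rscript}"
    for script in $STEPS; do
        echo "running $script"
        "$runner" "$script" || { echo "failed at $script" >&2; return 1; }
    done
}

# Usage, from the main project directory:
# run_steps Rscript
```

Stopping at the first failure matters here because each step reads files written by the one before it.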
