Closed taransamarth closed 1 year ago
SMC: 40,000 sampled plans of 27 districts on 14,926 units
`adapt_k_thresh`=0.985 • `seq_alpha`=0.95
`est_label_mult`=1 • `pop_temper`=0.001

Plan diversity 80% range: 0.84 to 0.96

R-hat values for summary statistics:
pop_overlap     total_vap       plan_dev        comp_edge       comp_polsby     pop_white       pop_black
1.063915        1.013900        1.031622        1.036186        1.009594        1.076823        1.058996
pop_hisp        pop_aian        pop_asian       pop_nhpi        pop_other       pop_two         vap_white
1.078089        1.070093        1.063395        1.011578        1.006390        1.049825        1.067398
vap_black       vap_hisp        vap_aian        vap_asian       vap_nhpi        vap_other       vap_two
1.059014        1.082387        1.067186        1.065565        1.019735        1.008577        1.044736
pre_16_dem_cli  pre_16_rep_tru  pre_20_dem_bid  pre_20_rep_tru  uss_16_dem_sch  uss_16_rep_lon  uss_18_dem_gil
1.002592        1.090329        1.002190        1.090287        1.002718        1.068018        1.001814
uss_18_rep_far  gov_18_dem_cuo  gov_18_rep_mol  atg_18_dem_jam  atg_18_rep_wof  adv_16          adv_18
1.071867        1.002289        1.075374        1.002081        1.070159        1.000305        1.001815
adv_20          arv_16          arv_18          arv_20          county_splits   muni_splits     ndv
1.002190        1.080993        1.072695        1.090287        1.092228        1.022409        1.001190
nrv             ndshare         e_dvs           pr_dem          e_dem           pbias           egap
1.081729        1.065488        1.067648        1.076577        1.000764        1.049656        1.002305

✖ WARNING: SMC runs have not converged.

Sampling diagnostics for SMC run 1 of 2 (20,000 samples)

          Eff. samples (%)  Acc. rate  Log wgt. sd  Max. unique     Est. k
Split 1   17,649 (88.2%)    19.4%      0.38         12,676 (100%)   15
Split 2   15,033 (75.2%)    31.4%      0.45         12,197 ( 96%)   9
Split 3   14,440 (72.2%)    38.3%      0.52         11,923 ( 94%)   7
Split 4   12,334 (61.7%)    47.5%      0.53         11,737 ( 93%)   5
Split 5    7,596 (38.0%)    41.1%      0.55         11,520 ( 91%)   6
Split 6    7,979 (39.9%)    25.6%      0.56         11,309 ( 89%)   10
Split 7    5,768 (28.8%)    38.9%      0.59         11,313 ( 89%)   6
Split 8    9,787 (48.9%)    42.0%      0.58         11,037 ( 87%)   5
Split 9    9,117 (45.6%)    54.3%      0.58         11,313 ( 89%)   3
Split 10   8,686 (43.4%)    29.9%      0.59         11,272 ( 89%)   7
Split 11  11,581 (57.9%)    43.3%      0.59         11,266 ( 89%)   4
Split 12  10,268 (51.3%)    48.5%      0.59         11,309 ( 89%)   3
Split 13   6,396 (32.0%)    29.1%      0.60         11,235 ( 89%)   6
Split 14   9,415 (47.1%)    24.0%      0.60         11,051 ( 87%)   7
Split 15   7,093 (35.5%)    35.5%      0.77         10,544 ( 83%)   4
Split 16   9,186 (45.9%)    40.4%      0.79         10,508 ( 83%)   3
Split 17   9,160 (45.8%)    46.0%      0.82         10,623 ( 84%)   2
Split 18   8,660 (43.3%)    43.4%      0.83         10,385 ( 82%)   2
Split 19   8,986 (44.9%)    40.9%      0.85         10,435 ( 83%)   2
Split 20   8,885 (44.4%)    22.2%      0.87         10,267 ( 81%)   5
Split 21   7,042 (35.2%)    30.0%      0.89         10,219 ( 81%)   3
Split 22   7,321 (36.6%)    32.6%      0.89          9,969 ( 79%)   2
Split 23   9,186 (45.9%)    15.5%      0.87          9,788 ( 77%)   5
Split 24   7,257 (36.3%)    16.5%      0.85          9,694 ( 77%)   4
Split 25   8,487 (42.4%)    13.2%      0.81          9,530 ( 75%)   4
Split 26   8,603 (43.0%)     3.4%      0.75          9,003 ( 71%)   6
Resample   7,609 (38.0%)      NA%      0.83         10,245 ( 81%)   NA

Sampling diagnostics for SMC run 2 of 2 (20,000 samples)

          Eff. samples (%)  Acc. rate  Log wgt. sd  Max. unique     Est. k
Split 1   17,637 (88.2%)    22.3%      0.38         12,655 (100%)   13
Split 2   14,759 (73.8%)    31.3%      0.44         12,144 ( 96%)   9
Split 3   11,481 (57.4%)    43.0%      0.51         12,005 ( 95%)   6
Split 4   12,204 (61.0%)    33.0%      0.53         11,704 ( 93%)   8
Split 5    8,715 (43.6%)    36.4%      0.55         11,626 ( 92%)   7
Split 6   12,089 (60.4%)    45.4%      0.55         11,406 ( 90%)   5
Split 7    8,473 (42.4%)    38.5%      0.56         11,401 ( 90%)   6
Split 8   11,684 (58.4%)    26.3%      0.57         11,352 ( 90%)   9
Split 9    7,604 (38.0%)    35.7%      0.58         11,312 ( 89%)   6
Split 10   9,205 (46.0%)    39.1%      0.59         11,252 ( 89%)   5
Split 11  10,376 (51.9%)    37.0%      0.59         11,242 ( 89%)   5
Split 12  10,479 (52.4%)    35.4%      0.61         11,221 ( 89%)   5
Split 13  10,653 (53.3%)    39.1%      0.60         11,269 ( 89%)   4
Split 14   8,006 (40.0%)    44.0%      0.60         11,150 ( 88%)   3
Split 15   9,262 (46.3%)    49.9%      0.76         10,531 ( 83%)   2
Split 16   8,906 (44.5%)    34.1%      0.81         10,520 ( 83%)   4
Split 17   7,781 (38.9%)    26.7%      0.84         10,436 ( 83%)   5
Split 18   7,805 (39.0%)    30.2%      0.87         10,276 ( 81%)   4
Split 19   6,385 (31.9%)    28.4%      0.89         10,205 ( 81%)   4
Split 20   6,369 (31.8%)    32.3%      0.88         10,124 ( 80%)   3
Split 21   5,909 (29.5%)    36.5%      0.90         10,063 ( 80%)   2
Split 22   6,900 (34.5%)    32.6%      0.87          9,840 ( 78%)   2
Split 23   7,381 (36.9%)    28.3%      0.87          9,747 ( 77%)   2
Split 24   6,028 (30.1%)    24.0%      0.85          9,490 ( 75%)   2
Split 25   6,781 (33.9%)    19.1%      0.80          9,152 ( 72%)   2
Split 26   9,092 (45.5%)     7.4%      0.73          8,577 ( 68%)   2
Resample   8,196 (41.0%)      NA%      0.83         10,427 ( 82%)   NA

• Watch out for low effective samples, very low acceptance rates (less than 1%), large std. devs. of the log weights (more than 3 or so), and low numbers of unique plans. R-hat values for summary statistics should be between 1 and 1.05.
• SMC convergence: Increase the number of samples. If you are experiencing low plan diversity or bottlenecks as well, address those issues first.
SMC: 5,000 sampled plans of 27 districts on 14,926 units
`adapt_k_thresh`=0.985 • `seq_alpha`=0.95
`est_label_mult`=1 • `pop_temper`=0.001

Plan diversity 80% range: 0.82 to 0.96

R-hat values for summary statistics:
pop_overlap     total_vap       plan_dev        comp_edge       comp_polsby     pop_white       pop_black
1.057204        1.015243        1.026230        1.033367        1.008623        1.076971        1.059918
pop_hisp        pop_aian        pop_asian       pop_nhpi        pop_other       pop_two         vap_white
1.079270        1.063427        1.056164        1.010298        1.011788        1.048506        1.068323
vap_black       vap_hisp        vap_aian        vap_asian       vap_nhpi        vap_other       vap_two
1.060618        1.083367        1.060725        1.058351        1.016829        1.010043        1.045474
pre_16_dem_cli  pre_16_rep_tru  pre_20_dem_bid  pre_20_rep_tru  uss_16_dem_sch  uss_16_rep_lon  uss_18_dem_gil
1.002666        1.095105        1.002145        1.094742        1.002692        1.070332        1.002164
uss_18_rep_far  gov_18_dem_cuo  gov_18_rep_mol  atg_18_dem_jam  atg_18_rep_wof  adv_16          adv_18
1.073941        1.002275        1.077567        1.002090        1.072197        1.000183        1.001720
adv_20          arv_16          arv_18          arv_20          county_splits   muni_splits     ndv
1.002145        1.084494        1.074847        1.094742        1.096600        1.018751        1.001040
nrv             ndshare         e_dvs.x         pr_dem.x        e_dem.x         pbias.x         egap.x
1.085177        1.067367        1.069494        1.072567        1.002516        1.050635        1.004869
e_dvs.y         pr_dem.y        e_dem.y         pbias.y         egap.y
1.069494        1.072567        1.002516        1.050635        1.004869

✖ WARNING: SMC runs have not converged.

The per-run sampling diagnostics (SMC runs 1 and 2, 20,000 samples each) and the accompanying guidance notes are identical to those in the 40,000-plan summary above.
Redistricting requirements
In New York, districts must, per judicial order:
When developing the 2010 map, the courts decided to assign zero weight to incumbent protection and minimal weight to core preservation.
Algorithmic Constraints
We enforce a maximum population deviation of 0.5%.
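To make the 0.5% tolerance concrete, here is a minimal Python sketch of how a plan's maximum population deviation can be checked. It assumes the usual definition (the largest relative gap between any district's population and the ideal, equal-population target); the function names and the toy populations are hypothetical and are not taken from the analysis code or from New York data.

```python
def max_population_deviation(district_pops):
    """Largest relative gap between a district's population and the ideal size."""
    target = sum(district_pops) / len(district_pops)  # ideal (equal) population
    return max(abs(pop - target) / target for pop in district_pops)


def within_tolerance(district_pops, tol=0.005):
    """True if every district is within `tol` (0.5% by default) of the target."""
    return max_population_deviation(district_pops) <= tol


# Hypothetical populations for a toy 4-district plan (not New York data).
toy_plan = [100_200, 99_900, 100_150, 99_750]
print(f"max deviation: {max_population_deviation(toy_plan):.4%}")
print("within 0.5% tolerance:", within_tolerance(toy_plan))
```

A plan whose worst district is more than 0.5% above or below the target would fail this check and fall outside the constraint described above.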
Data Sources
Data for New York comes from the ALARM Project's 2010 Redistricting Data Files.
Pre-processing Notes
We use a county constraint to preserve district cores, since districts are generally structured around counties.
Simulation Notes
We sample 40,000 districting plans for New York over two runs of the SMC algorithm and thin the sample down to 5,000 plans.
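The arithmetic behind the thinning step can be sketched as follows. The description does not say how the plans were thinned, so this sketch assumes evenly spaced thinning with an equal share kept from each run; `thin_runs` is a hypothetical helper, and integer indices stand in for actual plans.

```python
def thin_runs(runs, keep_total):
    """Keep an evenly spaced subset of plans, split equally across runs."""
    per_run = keep_total // len(runs)           # plans to keep from each run
    thinned = []
    for plans in runs:
        step = max(1, len(plans) // per_run)    # spacing between kept plans
        thinned.append(plans[::step][:per_run])
    return thinned


# Indices stand in for plans: 2 runs of 20,000 draws each (40,000 total).
runs = [list(range(20_000)), list(range(20_000))]
thinned = thin_runs(runs, keep_total=5_000)
print([len(r) for r in thinned])
```

Under these assumptions each run keeps every 8th plan, which gives 2,500 plans per run and 5,000 overall.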
No special techniques were needed to produce the sample.
Validation
40,000 plans
5,000 plans (thinned)
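The validation summaries report an R-hat value for each summary statistic and note that values should fall between 1 and 1.05. As a rough illustration of what that diagnostic measures, the Python sketch below computes the classic Gelman-Rubin R-hat across two runs. This is an assumption for illustration only: it is not necessarily the exact variant that produced the reported numbers, and the simulated runs are toy data. The three reported values at the end are copied from the 40,000-plan summary.

```python
import random
from math import sqrt
from statistics import mean, variance


def gelman_rubin_rhat(chains):
    """Classic (non-split) Gelman-Rubin R-hat for equal-length runs."""
    n = len(chains[0])
    within = mean([variance(c) for c in chains])        # average within-run variance
    between = n * variance([mean(c) for c in chains])   # scaled variance of run means
    pooled = (n - 1) / n * within + between / n         # pooled variance estimate
    return sqrt(pooled / within)


rng = random.Random(0)  # fixed seed so the toy data is reproducible
run_a = [rng.gauss(0, 1) for _ in range(1000)]
run_b = [rng.gauss(0, 1) for _ in range(1000)]
run_shifted = [x + 1.0 for x in run_b]  # a run centered somewhere else

rhat_agree = gelman_rubin_rhat([run_a, run_b])           # runs agree: close to 1
rhat_disagree = gelman_rubin_rhat([run_a, run_shifted])  # runs disagree: well above 1

# A few R-hat values copied from the 40,000-plan summary, checked against 1.05.
reported = {"county_splits": 1.092228, "muni_splits": 1.022409, "adv_16": 1.000305}
flagged = sorted(name for name, r in reported.items() if r > 1.05)
print(rhat_agree, rhat_disagree, flagged)
```

When the two runs give noticeably different distributions for a statistic, the between-run term inflates the ratio, which is why statistics such as `county_splits` exceed the 1.05 threshold and trigger the convergence warning.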
Checklist
TODO lines from the template code have been removed
[…] enforce_style() to format my code
[…] redist_map and redist_plans objects, and summary statistics) have been edited
@christopherkenny