alarm-redist / fifty-states

Redistricting analysis for all 50 U.S. states
https://alarm-redist.github.io/fifty-states/

2010 New York Congressional Districts #138

Closed taransamarth closed 1 year ago

taransamarth commented 2 years ago

Redistricting requirements

Per judicial order, districts in New York must:

  1. be contiguous
  2. have equal populations
  3. be geographically compact
  4. preserve political subdivisions, communities of interest, and cores of existing districts
  5. protect incumbents where possible.

When developing the 2010 map, the courts decided to assign zero weight to incumbent protection and minimal weight to core preservation.

Algorithmic Constraints

We enforce a maximum population deviation of 0.5%.
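
As a rough illustration of what this constraint checks, the sketch below computes the largest relative gap between any district's population and the ideal (equal-share) population, then compares it to the 0.5% tolerance. It assumes the deviation is measured relative to the ideal district population; the function names and the toy data are hypothetical and are not taken from the analysis code.

```python
from collections import defaultdict


def max_population_deviation(assignment, populations):
    """Largest relative gap between a district's population and the ideal."""
    totals = defaultdict(int)
    for district, pop in zip(assignment, populations):
        totals[district] += pop
    # Ideal population: total population divided evenly among districts.
    ideal = sum(totals.values()) / len(totals)
    return max(abs(total - ideal) / ideal for total in totals.values())


def within_tolerance(assignment, populations, tolerance=0.005):
    """True if no district deviates from the ideal by more than `tolerance`."""
    return max_population_deviation(assignment, populations) <= tolerance


# Toy example (made-up numbers): six units assigned to two districts.
assignment = [1, 1, 1, 2, 2, 2]
populations = [100, 200, 301, 150, 250, 199]
deviation = max_population_deviation(assignment, populations)
print(f"max deviation: {deviation:.4%}")
print("within 0.5%:", within_tolerance(assignment, populations))
```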

Data Sources

Data for New York comes from the ALARM Project's 2010 Redistricting Data Files.

Pre-processing Notes

We use a county constraint to preserve district cores, since districts are generally structured around counties.
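
For intuition, a county constraint of this kind discourages plans that divide counties among several districts. The sketch below counts how many counties a plan splits, which is the sort of quantity reported as `county_splits` in the validation output. It is not how the sampler implements the constraint; the function name and the toy assignment are hypothetical.

```python
from collections import defaultdict


def count_county_splits(assignment, counties):
    """Number of counties whose units fall in more than one district."""
    districts_by_county = defaultdict(set)
    for district, county in zip(assignment, counties):
        districts_by_county[county].add(district)
    return sum(1 for d in districts_by_county.values() if len(d) > 1)


# Toy example (made-up assignment): two units per county.
counties = ["Albany", "Albany", "Erie", "Erie", "Kings", "Kings"]
assignment = [1, 1, 1, 2, 3, 3]
splits = count_county_splits(assignment, counties)
print("counties split:", splits)
```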

Simulation Notes

We sample 40,000 districting plans for New York across two runs of the SMC algorithm (20,000 plans per run), then thin the sample to 5,000 plans.
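
The text above does not say how the thinning is done. One plausible approach, sketched below purely as an assumption, keeps an equal number of plans from each run (2,500 from each of the two runs) by random subsampling with a fixed seed. The function name, the draw numbering, and the seed are all hypothetical.

```python
import random


def thin_plans(draws_by_run, keep_per_run, seed=0):
    """Keep `keep_per_run` randomly chosen draw IDs from each run."""
    rng = random.Random(seed)  # fixed seed so the thinning is reproducible
    thinned = {}
    for run, draws in draws_by_run.items():
        thinned[run] = sorted(rng.sample(draws, keep_per_run))
    return thinned


# Two runs of 20,000 draws each, identified here by consecutive integers.
draws_by_run = {
    1: list(range(1, 20001)),
    2: list(range(20001, 40001)),
}
thinned = thin_plans(draws_by_run, keep_per_run=2500)
total_kept = sum(len(kept) for kept in thinned.values())
print("plans kept:", total_kept)
```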

No special techniques were needed to produce the sample.

Validation

40,000 plans

SMC: 40,000 sampled plans of 27 districts on 14,926 units
`adapt_k_thresh`=0.985 • `seq_alpha`=0.95
`est_label_mult`=1 • `pop_temper`=0.001

Plan diversity 80% range: 0.84 to 0.96

R-hat values for summary statistics:
   pop_overlap      total_vap       plan_dev      comp_edge    comp_polsby      pop_white      pop_black 
      1.063915       1.013900       1.031622       1.036186       1.009594       1.076823       1.058996 
      pop_hisp       pop_aian      pop_asian       pop_nhpi      pop_other        pop_two      vap_white 
      1.078089       1.070093       1.063395       1.011578       1.006390       1.049825       1.067398 
     vap_black       vap_hisp       vap_aian      vap_asian       vap_nhpi      vap_other        vap_two 
      1.059014       1.082387       1.067186       1.065565       1.019735       1.008577       1.044736 
pre_16_dem_cli pre_16_rep_tru pre_20_dem_bid pre_20_rep_tru uss_16_dem_sch uss_16_rep_lon uss_18_dem_gil 
      1.002592       1.090329       1.002190       1.090287       1.002718       1.068018       1.001814 
uss_18_rep_far gov_18_dem_cuo gov_18_rep_mol atg_18_dem_jam atg_18_rep_wof         adv_16         adv_18 
      1.071867       1.002289       1.075374       1.002081       1.070159       1.000305       1.001815 
        adv_20         arv_16         arv_18         arv_20  county_splits    muni_splits            ndv 
      1.002190       1.080993       1.072695       1.090287       1.092228       1.022409       1.001190 
           nrv        ndshare          e_dvs         pr_dem          e_dem          pbias           egap 
      1.081729       1.065488       1.067648       1.076577       1.000764       1.049656       1.002305 
✖ WARNING: SMC runs have not converged.

Sampling diagnostics for SMC run 1 of 2 (20,000 samples)
         Eff. samples (%) Acc. rate Log wgt. sd   Max. unique Est. k 
Split 1    17,649 (88.2%)     19.4%        0.38 12,676 (100%)     15 
Split 2    15,033 (75.2%)     31.4%        0.45 12,197 ( 96%)      9 
Split 3    14,440 (72.2%)     38.3%        0.52 11,923 ( 94%)      7 
Split 4    12,334 (61.7%)     47.5%        0.53 11,737 ( 93%)      5 
Split 5     7,596 (38.0%)     41.1%        0.55 11,520 ( 91%)      6 
Split 6     7,979 (39.9%)     25.6%        0.56 11,309 ( 89%)     10 
Split 7     5,768 (28.8%)     38.9%        0.59 11,313 ( 89%)      6 
Split 8     9,787 (48.9%)     42.0%        0.58 11,037 ( 87%)      5 
Split 9     9,117 (45.6%)     54.3%        0.58 11,313 ( 89%)      3 
Split 10    8,686 (43.4%)     29.9%        0.59 11,272 ( 89%)      7 
Split 11   11,581 (57.9%)     43.3%        0.59 11,266 ( 89%)      4 
Split 12   10,268 (51.3%)     48.5%        0.59 11,309 ( 89%)      3 
Split 13    6,396 (32.0%)     29.1%        0.60 11,235 ( 89%)      6 
Split 14    9,415 (47.1%)     24.0%        0.60 11,051 ( 87%)      7 
Split 15    7,093 (35.5%)     35.5%        0.77 10,544 ( 83%)      4 
Split 16    9,186 (45.9%)     40.4%        0.79 10,508 ( 83%)      3 
Split 17    9,160 (45.8%)     46.0%        0.82 10,623 ( 84%)      2 
Split 18    8,660 (43.3%)     43.4%        0.83 10,385 ( 82%)      2 
Split 19    8,986 (44.9%)     40.9%        0.85 10,435 ( 83%)      2 
Split 20    8,885 (44.4%)     22.2%        0.87 10,267 ( 81%)      5 
Split 21    7,042 (35.2%)     30.0%        0.89 10,219 ( 81%)      3 
Split 22    7,321 (36.6%)     32.6%        0.89  9,969 ( 79%)      2 
Split 23    9,186 (45.9%)     15.5%        0.87  9,788 ( 77%)      5 
Split 24    7,257 (36.3%)     16.5%        0.85  9,694 ( 77%)      4 
Split 25    8,487 (42.4%)     13.2%        0.81  9,530 ( 75%)      4 
Split 26    8,603 (43.0%)      3.4%        0.75  9,003 ( 71%)      6 
Resample    7,609 (38.0%)       NA%        0.83 10,245 ( 81%)     NA 

Sampling diagnostics for SMC run 2 of 2 (20,000 samples)
         Eff. samples (%) Acc. rate Log wgt. sd   Max. unique Est. k 
Split 1    17,637 (88.2%)     22.3%        0.38 12,655 (100%)     13 
Split 2    14,759 (73.8%)     31.3%        0.44 12,144 ( 96%)      9 
Split 3    11,481 (57.4%)     43.0%        0.51 12,005 ( 95%)      6 
Split 4    12,204 (61.0%)     33.0%        0.53 11,704 ( 93%)      8 
Split 5     8,715 (43.6%)     36.4%        0.55 11,626 ( 92%)      7 
Split 6    12,089 (60.4%)     45.4%        0.55 11,406 ( 90%)      5 
Split 7     8,473 (42.4%)     38.5%        0.56 11,401 ( 90%)      6 
Split 8    11,684 (58.4%)     26.3%        0.57 11,352 ( 90%)      9 
Split 9     7,604 (38.0%)     35.7%        0.58 11,312 ( 89%)      6 
Split 10    9,205 (46.0%)     39.1%        0.59 11,252 ( 89%)      5 
Split 11   10,376 (51.9%)     37.0%        0.59 11,242 ( 89%)      5 
Split 12   10,479 (52.4%)     35.4%        0.61 11,221 ( 89%)      5 
Split 13   10,653 (53.3%)     39.1%        0.60 11,269 ( 89%)      4 
Split 14    8,006 (40.0%)     44.0%        0.60 11,150 ( 88%)      3 
Split 15    9,262 (46.3%)     49.9%        0.76 10,531 ( 83%)      2 
Split 16    8,906 (44.5%)     34.1%        0.81 10,520 ( 83%)      4 
Split 17    7,781 (38.9%)     26.7%        0.84 10,436 ( 83%)      5 
Split 18    7,805 (39.0%)     30.2%        0.87 10,276 ( 81%)      4 
Split 19    6,385 (31.9%)     28.4%        0.89 10,205 ( 81%)      4 
Split 20    6,369 (31.8%)     32.3%        0.88 10,124 ( 80%)      3 
Split 21    5,909 (29.5%)     36.5%        0.90 10,063 ( 80%)      2 
Split 22    6,900 (34.5%)     32.6%        0.87  9,840 ( 78%)      2 
Split 23    7,381 (36.9%)     28.3%        0.87  9,747 ( 77%)      2 
Split 24    6,028 (30.1%)     24.0%        0.85  9,490 ( 75%)      2 
Split 25    6,781 (33.9%)     19.1%        0.80  9,152 ( 72%)      2 
Split 26    9,092 (45.5%)      7.4%        0.73  8,577 ( 68%)      2 
Resample    8,196 (41.0%)       NA%        0.83 10,427 ( 82%)     NA 
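
To help read the "Eff. samples" column, the sketch below computes the standard (Kish) effective sample size from a set of log importance weights. This is the textbook formula and is only assumed to correspond to what the diagnostics table reports; the function name and example weights are hypothetical.

```python
import math


def effective_sample_size(log_weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    # Subtract the maximum before exponentiating for numerical stability;
    # the ratio is unchanged by rescaling the weights.
    shift = max(log_weights)
    weights = [math.exp(lw - shift) for lw in log_weights]
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


# Equal weights: every sample counts fully.
equal = effective_sample_size([0.0, 0.0, 0.0, 0.0])
# One dominant weight: the sample behaves like a single draw.
degenerate = effective_sample_size([0.0, -1000.0, -1000.0])
print(equal, degenerate)
```

An effective sample size well below the nominal number of samples means a few heavily weighted plans dominate the sample.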

•  Watch out for low effective samples, very low acceptance rates (less than 1%), large std. devs. of the log
weights (more than 3 or so), and low numbers of unique plans. R-hat values for summary statistics should be
between 1 and 1.05.
• SMC convergence: Increase the number of samples. If you are experiencing low plan diversity or bottlenecks
as well, address those issues first.
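
The R-hat guidance above can be applied mechanically. The sketch below flags statistics whose R-hat exceeds 1.05, using a handful of values copied from the 40,000-plan table; the helper name is hypothetical and the selection of statistics is only a sample, not the full table.

```python
RHAT_LIMIT = 1.05  # upper bound suggested in the diagnostic guidance


def flag_unconverged(rhats, limit=RHAT_LIMIT):
    """Return the names of statistics whose R-hat exceeds `limit`."""
    return sorted(name for name, value in rhats.items() if value > limit)


# A subset of the R-hat values reported for the 40,000-plan sample.
rhats = {
    "pop_overlap": 1.063915,
    "total_vap": 1.013900,
    "plan_dev": 1.031622,
    "comp_polsby": 1.009594,
    "county_splits": 1.092228,
    "muni_splits": 1.022409,
    "egap": 1.002305,
}
flagged = flag_unconverged(rhats)
print("R-hat above 1.05:", flagged)
```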

5,000 plans (thinned)

SMC: 5,000 sampled plans of 27 districts on 14,926 units
`adapt_k_thresh`=0.985 • `seq_alpha`=0.95
`est_label_mult`=1 • `pop_temper`=0.001

Plan diversity 80% range: 0.82 to 0.96

R-hat values for summary statistics:
   pop_overlap      total_vap       plan_dev      comp_edge    comp_polsby      pop_white      pop_black 
      1.057204       1.015243       1.026230       1.033367       1.008623       1.076971       1.059918 
      pop_hisp       pop_aian      pop_asian       pop_nhpi      pop_other        pop_two      vap_white 
      1.079270       1.063427       1.056164       1.010298       1.011788       1.048506       1.068323 
     vap_black       vap_hisp       vap_aian      vap_asian       vap_nhpi      vap_other        vap_two 
      1.060618       1.083367       1.060725       1.058351       1.016829       1.010043       1.045474 
pre_16_dem_cli pre_16_rep_tru pre_20_dem_bid pre_20_rep_tru uss_16_dem_sch uss_16_rep_lon uss_18_dem_gil 
      1.002666       1.095105       1.002145       1.094742       1.002692       1.070332       1.002164 
uss_18_rep_far gov_18_dem_cuo gov_18_rep_mol atg_18_dem_jam atg_18_rep_wof         adv_16         adv_18 
      1.073941       1.002275       1.077567       1.002090       1.072197       1.000183       1.001720 
        adv_20         arv_16         arv_18         arv_20  county_splits    muni_splits            ndv 
      1.002145       1.084494       1.074847       1.094742       1.096600       1.018751       1.001040 
           nrv        ndshare        e_dvs.x       pr_dem.x        e_dem.x        pbias.x         egap.x 
      1.085177       1.067367       1.069494       1.072567       1.002516       1.050635       1.004869 
       e_dvs.y       pr_dem.y        e_dem.y        pbias.y         egap.y 
      1.069494       1.072567       1.002516       1.050635       1.004869 
✖ WARNING: SMC runs have not converged.

Sampling diagnostics for both SMC runs are identical to those listed under the 40,000-plan sample above.

Checklist

@christopherkenny