alarm-redist / fifty-states

Redistricting analysis for all 50 U.S. states
https://alarm-redist.github.io/fifty-states/

Re-run 2020 Pennsylvania Congressional Districts #98

Closed CoryMcCartan closed 2 years ago

CoryMcCartan commented 2 years ago

Redistricting requirements

In Pennsylvania, there are few formal districting requirements, but districts must generally:

  1. be contiguous
  2. have equal populations
  3. be geographically compact
  4. preserve county and municipality boundaries as much as possible

Interpretation of requirements

We enforce a maximum population deviation of 0.5%. We apply a county/municipality constraint, as described below.
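
As a rough illustration of what the 0.5% tolerance means (this is not the code used in the analysis), the sketch below assumes deviation is measured relative to the ideal district population, i.e. total population divided by the number of districts. The function names and the district populations are hypothetical.

```python
# Illustrative sketch only: populations are made up, not Pennsylvania data.

def max_pop_deviation(district_pops):
    """Largest relative deviation of any district from the ideal population."""
    target = sum(district_pops) / len(district_pops)  # ideal district size
    return max(abs(p - target) / target for p in district_pops)


def within_tolerance(district_pops, tol=0.005):
    """True if every district is within `tol` (0.5% by default) of the ideal."""
    return max_pop_deviation(district_pops) <= tol


# Hypothetical 4-district plan
pops = [764_000, 766_500, 763_200, 765_900]
print(f"max deviation: {max_pop_deviation(pops):.4%}")
print("within 0.5%:", within_tolerance(pops))
```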

Data Sources

Data for Pennsylvania comes from the ALARM Project's 2020 Redistricting Data Files.

Pre-processing Notes

No manual pre-processing decisions were necessary.

Simulation Notes

We sample 10,000 districting plans for Pennsylvania across two runs of the SMC algorithm, then filter down to 5,000 total plans. To balance county and municipality splits, we create pseudocounties for use in the county constraint. Outside of Allegheny County, Montgomery County, and Philadelphia County, each county is its own pseudocounty. Within those three counties, each municipality is its own pseudocounty. These three counties were chosen because they are necessarily split by congressional districts. We also apply an additional Gibbs constraint to further avoid splitting municipalities.
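
The pseudocounty labelling can be sketched as follows. This is a minimal illustration, not the code used for the simulation: the `pseudocounty` helper, the label format, and the handful of example units are all hypothetical.

```python
# Illustrative sketch of pseudocounty labels (not the analysis code).
# Counties that must be split are broken up by municipality;
# every other county is kept whole as a single pseudocounty.

SPLIT_COUNTIES = {"Allegheny", "Montgomery", "Philadelphia"}


def pseudocounty(county, muni):
    """Return the pseudocounty label for one geographic unit."""
    if county in SPLIT_COUNTIES:
        # Inside the three large counties, each municipality is its own unit.
        return f"{county}/{muni}"
    # Elsewhere, the county itself is the unit.
    return county


# Hypothetical example units as (county, municipality) pairs
units = [
    ("Allegheny", "Pittsburgh"),
    ("Allegheny", "Mt. Lebanon"),
    ("Centre", "State College"),
    ("Centre", "Bellefonte"),
]
labels = [pseudocounty(c, m) for c, m in units]
print(labels)
```

The two Centre County units share one label, so the county constraint discourages splitting that county. The two Allegheny County units get distinct labels, so the same constraint discourages splitting each municipality instead.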

Validation

image

Performance: image

SMC: 5,000 sampled plans of 17 districts on 9,178 units
`adapt_k_thresh`=0.985 • `seq_alpha`=0.5
`est_label_mult`=1 • `pop_temper`=0.02

Plan diversity 80% range: 0.64 to 0.83

R-hat values for summary statistics:
   pop_overlap      total_vap       plan_dev      comp_edge    comp_polsby       pop_hisp      pop_white 
       1.00228        1.00001        1.00120        1.01284        1.04059        1.00331        1.00470 
     pop_black       pop_aian      pop_asian       pop_nhpi      pop_other        pop_two       vap_hisp 
       1.00920        1.00530        1.02454        1.05424        1.01004        1.00273        1.00714 
     vap_white      vap_black       vap_aian      vap_asian       vap_nhpi      vap_other        vap_two 
       1.00486        1.01070        1.00614        1.02389        1.01650        1.00496        1.01967 
pre_16_dem_cli pre_16_rep_tru uss_16_dem_mcg uss_16_rep_too atg_16_dem_sha atg_16_rep_raf uss_18_dem_cas 
       1.00445        0.99985        1.00583        1.00018        1.00512        1.00256        1.00883 
uss_18_rep_bar gov_18_dem_wol gov_18_rep_wag         arv_16         adv_16         arv_18         adv_18 
       1.00969        1.00777        1.00270        1.00063        1.00432        1.00606        1.00846 
 county_splits    muni_splits            ndv            nrv        ndshare          e_dvs         pr_dem 
       1.00077        1.04568        1.00365        1.00080        1.00173        1.00163        1.00156 
         e_dem          pbias           egap 
       1.01976        1.00438        1.01516 
✖ WARNING: SMC runs have not converged.
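
To see which statistics are likely behind the warning, the R-hat values can be compared against a cutoff. The sketch below uses a handful of values copied from the table above and a cutoff of 1.05, which is an assumption on our part since the output does not state the threshold. The `flag_rhat` helper is hypothetical.

```python
# Illustrative sketch: flag summary statistics whose R-hat exceeds a cutoff.
# The 1.05 cutoff is an assumption; the diagnostics output does not state it.

RHAT_CUTOFF = 1.05

# A subset of the R-hat values reported above
rhats = {
    "comp_polsby": 1.04059,
    "pop_asian": 1.02454,
    "pop_nhpi": 1.05424,
    "muni_splits": 1.04568,
    "county_splits": 1.00077,
}


def flag_rhat(values, cutoff=RHAT_CUTOFF):
    """Return the names of statistics with R-hat above the cutoff, sorted."""
    return sorted(name for name, r in values.items() if r > cutoff)


print(flag_rhat(rhats))
```

Under that assumed cutoff, `pop_nhpi` (1.05424) is the only value in the full table that exceeds it. `comp_polsby` and `muni_splits` come close.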

Sampling diagnostics for SMC run 1 of 2 (5,000 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     4,830 (96.6%)     17.1%        0.37 3,175 (100%)     13 
Split 2     4,753 (95.1%)     25.5%        0.42 3,098 ( 98%)      8 
Split 3     4,727 (94.5%)     36.4%        0.47 3,117 ( 99%)      5 
Split 4     4,675 (93.5%)     40.2%        0.52 3,097 ( 98%)      4 
Split 5     4,580 (91.6%)     45.7%        0.58 3,126 ( 99%)      3 
Split 6     4,502 (90.0%)     36.6%        0.62 3,121 ( 99%)      4 
Split 7     4,435 (88.7%)     34.2%        0.66 3,042 ( 96%)      4 
Split 8     4,433 (88.7%)     38.7%        0.69 3,073 ( 97%)      3 
Split 9     4,428 (88.6%)     43.7%        0.69 3,053 ( 97%)      2 
Split 10    4,435 (88.7%)     40.8%        0.67 3,046 ( 96%)      2 
Split 11    4,438 (88.8%)     38.5%        0.67 3,014 ( 95%)      2 
Split 12    4,435 (88.7%)     19.1%        0.63 2,989 ( 95%)      5 
Split 13    4,371 (87.4%)     14.2%        0.65 2,920 ( 92%)      6 
Split 14    4,311 (86.2%)     17.6%        0.65 2,883 ( 91%)      4 
Split 15    4,265 (85.3%)     16.9%        0.68 2,788 ( 88%)      3 
Split 16    4,259 (85.2%)      6.0%        0.70 2,545 ( 81%)      3 
Resample    2,416 (48.3%)       NA%        0.88 2,545 ( 81%)     NA 

Sampling diagnostics for SMC run 2 of 2 (5,000 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     4,824 (96.5%)     20.0%        0.38 3,160 (100%)     11 
Split 2     4,760 (95.2%)     28.6%        0.43 3,124 ( 99%)      7 
Split 3     4,742 (94.8%)     31.8%        0.46 3,115 ( 99%)      6 
Split 4     4,705 (94.1%)     40.5%        0.50 3,082 ( 98%)      4 
Split 5     4,650 (93.0%)     38.7%        0.54 3,063 ( 97%)      4 
Split 6     4,556 (91.1%)     23.0%        0.59 3,079 ( 97%)      7 
Split 7     4,414 (88.3%)     34.8%        0.64 3,054 ( 97%)      4 
Split 8     4,414 (88.3%)     17.8%        0.67 3,056 ( 97%)      8 
Split 9     4,437 (88.7%)     25.0%        0.66 3,001 ( 95%)      5 
Split 10    4,438 (88.8%)     33.2%        0.64 3,030 ( 96%)      3 
Split 11    4,389 (87.8%)     25.1%        0.65 3,050 ( 97%)      4 
Split 12    4,321 (86.4%)     28.4%        0.65 2,957 ( 94%)      3 
Split 13    4,406 (88.1%)     25.1%        0.60 2,997 ( 95%)      3 
Split 14    4,318 (86.4%)     21.2%        0.59 2,890 ( 91%)      3 
Split 15    4,205 (84.1%)     16.7%        0.65 2,837 ( 90%)      3 
Split 16    4,220 (84.4%)      7.7%        0.65 2,509 ( 79%)      2 
Resample    2,274 (45.5%)       NA%        0.84 2,583 ( 82%)     NA 
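
For readers unfamiliar with the "Eff. samples" column, the sketch below computes an effective sample size from importance weights using the standard formula (sum of weights squared, divided by the sum of squared weights). Whether the diagnostics above use exactly this formula is an assumption, and the weights here are made up.

```python
# Illustrative sketch: effective sample size (ESS) from log importance weights.
# Assumes the standard formula ESS = (sum w)^2 / sum(w^2); weights are made up.
import math


def effective_sample_size(log_weights):
    """ESS computed from log weights, shifted by the max for numerical stability."""
    m = max(log_weights)
    w = [math.exp(lw - m) for lw in log_weights]
    return sum(w) ** 2 / sum(x * x for x in w)


# Equal weights: every sample counts fully, so ESS equals the sample count.
equal = [0.0] * 5000
print(effective_sample_size(equal))

# One dominant weight: ESS collapses toward 1.
dominated = [0.0] * 9 + [100.0]
print(effective_sample_size(dominated))
```

The percentages in the table are the effective sample count divided by the 5,000 samples per run; for example, 4,830 of 5,000 is 96.6%.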

Checklist

@christopherkenny

christopherkenny commented 2 years ago

Thoughts on what's going on with municipalities here? It seems just a tad high. A reduction of 3 or so would set it more squarely in the middle 90% of the sample.

CoryMcCartan commented 2 years ago

We have a Gibbs constraint going but had to weaken it to get convergence. As a result, though, we are more in the middle for county splits (cf. https://github.com/alarm-redist/fifty-states/pull/68). On balance, no worse than the last sample, I think.

christopherkenny commented 2 years ago

Fair point, this does balance county splits better here. Not much to be done (absent ad-hoc merging) on that balance. This also has much improved diversity with this balancing.