alarm-redist / fifty-states

Redistricting analysis for all 50 U.S. states
https://alarm-redist.github.io/fifty-states/

Re-run 2020 Nebraska Congressional Districts #96

Closed CoryMcCartan closed 2 years ago

CoryMcCartan commented 2 years ago

Redistricting requirements

In Nebraska, districts must, under a legislative resolution:

  1. be contiguous
  2. have equal populations (specifically, within 0.5% of equality)
  3. be geographically compact
  4. preserve county and municipality boundaries as much as possible
  5. preserve the cores of prior districts
  6. not be drawn using partisan information

Interpretation of requirements

We enforce a maximum population deviation of 0.5%. We apply a county constraint. We preprocess the map to ensure the cores of prior districts are preserved, as described below.
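The 0.5% population rule can be checked with a few lines of arithmetic. The sketch below assumes that deviation is measured relative to the ideal (mean) district population. The function names and example populations are hypothetical and are not taken from the analysis.

```python
def max_population_deviation(district_pops):
    """Largest relative deviation of any district from the ideal (mean) population."""
    ideal = sum(district_pops) / len(district_pops)
    return max(abs(p - ideal) / ideal for p in district_pops)


def within_tolerance(district_pops, tol=0.005):
    """True if every district is within `tol` (0.5% by default) of the ideal."""
    return max_population_deviation(district_pops) <= tol


# Hypothetical populations for a 3-district plan (not real Nebraska data).
example_pops = [653_000, 655_500, 652_500]
print(max_population_deviation(example_pops))
print(within_tolerance(example_pops))
```

A plan passes only if its worst district is within the tolerance, so one oversized district is enough to reject it.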

Data Sources

Data for Nebraska comes from the ALARM Project's 2020 Redistricting Data Files.

Pre-processing Notes

To preserve the cores of prior districts, we merge all precincts that are more than two precincts away from a district border under the 2010 plan. Precincts in counties that are split by existing district boundaries are merged only within their county.
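The cores step described above can be sketched as a breadth-first search on the precinct adjacency graph. This is an illustration under stated assumptions, not the code used in the analysis. It assumes that a precinct touching another district has depth 1, that "more than two precincts away" means depth greater than 2, and that merged precincts can be represented by a shared label. The sketch does not check that a merged core is contiguous. All names are hypothetical.

```python
from collections import deque


def border_depth(adj, district):
    """Breadth-first depth of each precinct from its district's border.

    A precinct adjacent to another district has depth 1, its same-district
    neighbors have depth 2, and so on.
    """
    depth = {}
    queue = deque()
    for v, nbrs in adj.items():
        if any(district[u] != district[v] for u in nbrs):
            depth[v] = 1
            queue.append(v)
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in depth and district[u] == district[v]:
                depth[u] = depth[v] + 1
                queue.append(u)
    return depth


def core_labels(adj, district, county, boundary=2):
    """Give precincts deeper than `boundary` a shared core label."""
    depth = border_depth(adj, district)
    # A county is "split" if its precincts fall in more than one district.
    districts_in_county = {}
    for v in adj:
        districts_in_county.setdefault(county[v], set()).add(district[v])
    labels = {}
    for v in adj:
        if depth.get(v, float("inf")) > boundary:
            split = len(districts_in_county[county[v]]) > 1
            # In split counties, merge only within the county.
            labels[v] = ("core", district[v], county[v] if split else None)
        else:
            labels[v] = ("precinct", v)
    return labels


# Hypothetical example: ten precincts in a row, five per district.
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}
district = {i: "A" if i < 5 else "B" for i in range(10)}
county = {i: "X" if i < 5 else "Y" for i in range(10)}
labels = core_labels(adj, district, county)
print(labels)
```

In this toy map, precincts 0 to 2 collapse into one core for district A and precincts 7 to 9 into one core for district B, while the four precincts nearest the border stay separate and remain free to move between districts.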

Simulation Notes

We sample 5,000 districting plans for Nebraska across four runs of the SMC algorithm. Beyond the county constraint applied to the residual counties left over from the cores operation, we apply a Gibbs constraint of strength 2 to discourage county splits.
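To show what a Gibbs constraint of strength 2 does, the sketch below assumes the usual form in which a plan's target density is multiplied by exp(-strength * penalty), with the penalty taken here to be the number of split counties. The exact penalty used by the sampler may differ, and the function names and toy data are hypothetical.

```python
import math


def count_split_counties(plan, county):
    """Number of counties whose precincts are assigned to more than one district."""
    districts = {}
    for precinct, dist in plan.items():
        districts.setdefault(county[precinct], set()).add(dist)
    return sum(1 for d in districts.values() if len(d) > 1)


def gibbs_factor(plan, county, strength=2.0):
    """Unnormalized factor exp(-strength * penalty) applied to one plan."""
    return math.exp(-strength * count_split_counties(plan, county))


# Hypothetical toy data: four precincts in two counties.
county = {"p1": "A", "p2": "A", "p3": "B", "p4": "B"}
plan_no_split = {"p1": 1, "p2": 1, "p3": 2, "p4": 2}
plan_one_split = {"p1": 1, "p2": 2, "p3": 2, "p4": 2}
print(gibbs_factor(plan_no_split, county))
print(gibbs_factor(plan_one_split, county))
```

Under this form, each additional split county multiplies a plan's weight by exp(-2), roughly 0.135, so splits are discouraged rather than forbidden.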

Validation

image

SMC: 5,000 sampled plans of 3 districts on 1,402 units
`adapt_k_thresh`=0.985 • `seq_alpha`=0.5
`est_label_mult`=1 • `pop_temper`=0

Plan diversity 80% range: 0.13 to 0.43
✖ WARNING: Low plan diversity

R-hat values for summary statistics:
   pop_overlap      total_vap       plan_dev      comp_edge    comp_polsby       pop_hisp      pop_white 
       1.00193        1.00300        1.00084        1.00125        1.00083        0.99979        1.00117 
     pop_black       pop_aian      pop_asian       pop_nhpi      pop_other        pop_two       vap_hisp 
       1.00111        1.00140        1.00090        1.00018        1.00100        1.00102        0.99983 
     vap_white      vap_black       vap_aian      vap_asian       vap_nhpi      vap_other        vap_two 
       1.00182        1.00093        1.00209        1.00088        1.00022        1.00209        1.00042 
pre_16_rep_tru pre_16_dem_cli uss_18_rep_fis uss_18_dem_ray gov_18_rep_ric gov_18_dem_kri atg_18_rep_pet 
       1.00100        0.99995        1.00120        1.00027        1.00104        1.00003        1.00176 
sos_18_rep_evn sos_18_dem_dan pre_20_rep_tru pre_20_dem_bid uss_20_rep_sas uss_20_dem_jan         arv_16 
       1.00127        1.00022        1.00132        1.00093        1.00145        1.00014        1.00100 
        adv_16         arv_18         adv_18         arv_20         adv_20  county_splits    muni_splits 
       0.99995        1.00132        1.00014        1.00118        1.00064        1.00102        1.00070 
           ndv            nrv        ndshare          e_dvs          e_dem          pbias           egap 
       1.00000        1.00117        1.00061        1.00063        1.00077        0.99996        1.00140 
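For readers unfamiliar with R-hat, the sketch below computes the classic Gelman-Rubin statistic for one summary statistic across independent runs. It is an assumption that this matches the variant behind the values above, since the package may use a different (for example rank-normalized) version. The function name and toy data are hypothetical.

```python
from statistics import mean, variance


def r_hat(chains):
    """Classic Gelman-Rubin R-hat for equal-length runs of one statistic."""
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])   # W: mean within-run variance
    between = n * variance(chain_means)            # B: between-run variance
    var_plus = (n - 1) / n * within + between / n  # pooled variance estimate
    return (var_plus / within) ** 0.5


# Hypothetical toy runs: two that agree and two that disagree.
print(r_hat([[1, 2, 3, 4], [1, 2, 3, 4]]))
print(r_hat([[0, 1, 2], [10, 11, 12]]))
```

With this formula, runs that agree closely can give values slightly below 1 because the between-run term is small, which is one way figures such as 0.99979 can arise.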

Sampling diagnostics for SMC run 1 of 4 (1,250 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     1,239 (99.1%)      5.7%        0.20   778 ( 98%)      8 
Split 2     1,196 (95.7%)      3.5%        0.36   709 ( 90%)      5 
Resample      903 (72.2%)       NA%        0.34   756 ( 96%)     NA 

Sampling diagnostics for SMC run 2 of 4 (1,250 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     1,237 (99.0%)      5.7%        0.21   792 (100%)      8 
Split 2     1,199 (96.0%)      3.7%        0.37   700 ( 89%)      5 
Resample    1,013 (81.0%)       NA%        0.36   757 ( 96%)     NA 

Sampling diagnostics for SMC run 3 of 4 (1,250 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     1,238 (99.0%)      6.5%        0.20   778 ( 98%)      7 
Split 2     1,209 (96.7%)      4.3%        0.34   690 ( 87%)      4 
Resample    1,039 (83.1%)       NA%        0.32   775 ( 98%)     NA 

Sampling diagnostics for SMC run 4 of 4 (1,250 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     1,238 (99.1%)      4.6%        0.20   807 (102%)     10 
Split 2     1,212 (97.0%)      2.9%        0.33   705 ( 89%)      6 
Resample    1,065 (85.2%)       NA%        0.31   784 ( 99%)     NA 
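The "Eff. samples" and "Log wgt. sd" columns can be understood through the standard importance-sampling formulas. The sketch below assumes the usual effective sample size, (sum of weights)^2 / (sum of squared weights), and is not the package's own code. The function names and toy weights are hypothetical.

```python
import math
from statistics import stdev


def effective_sample_size(log_weights):
    """Effective sample size computed from unnormalized log weights."""
    m = max(log_weights)  # subtract the max for numerical stability
    w = [math.exp(lw - m) for lw in log_weights]
    return sum(w) ** 2 / sum(x * x for x in w)


def log_weight_sd(log_weights):
    """Sample standard deviation of the log weights."""
    return stdev(log_weights)


# Hypothetical log weights: equal weights versus one dominant plan.
print(effective_sample_size([0.0] * 10))
print(effective_sample_size([0.0, -100.0]))
print(log_weight_sd([0.0, 0.2, -0.2, 0.1]))
```

The percentages in the tables are the effective samples divided by the 1,250 samples in each run; for example, 1,239 / 1,250 is about 99.1%.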

•  Watch out for low effective samples, very low acceptance rates (less than 1%), large std. devs. of the
log weights (more than 3 or so), and low numbers of unique plans. R-hat values for summary statistics
should be between 1 and 1.05.
• Low diversity: Check for potential bottlenecks. Increase the number of samples. Examine the diversity
plot with `hist(plans_diversity(plans), breaks=24)`. Consider weakening or removing constraints, or
increasing the population tolerance. If the acceptance rate drops quickly in the final splits, try
increasing `pop_temper` by 0.01.

NOTE: The low diversity warning is spurious. The cores constraint limits the maximum variation of information (VI) distance between plans. We ran 4 independent runs rather than 2 to maximize diversity.
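To see why merged cores cap the VI distance, it helps to compute the distance directly. The sketch below implements an unweighted, unnormalized variation of information between two plans. The diversity values reported above may be population-weighted and normalized, so the numbers are not directly comparable. The function name and toy plans are hypothetical.

```python
import math
from collections import Counter


def variation_of_information(plan_a, plan_b):
    """VI distance H(A|B) + H(B|A) between two district assignments."""
    n = len(plan_a)
    count_a = Counter(plan_a)
    count_b = Counter(plan_b)
    count_ab = Counter(zip(plan_a, plan_b))
    vi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        vi -= p_ab * (math.log(p_ab / p_a) + math.log(p_ab / p_b))
    return vi


# Hypothetical toy plans over four units.
print(variation_of_information([1, 1, 2, 2], [1, 1, 2, 2]))  # same plan
print(variation_of_information([1, 1, 2, 2], [2, 2, 1, 1]))  # relabeled plan
print(variation_of_information([1, 1, 2, 2], [1, 2, 1, 2]))  # unrelated plan
```

When most units sit in a core that always stays in the same district, any two sampled plans agree on those units, so the distance between them cannot grow large no matter how the border precincts are assigned.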

Checklist

@christopherkenny