alarm-redist / fifty-states

Redistricting analysis for all 50 U.S. states
https://alarm-redist.github.io/fifty-states/

Re-run 2020 Iowa Congressional Districts #92

Closed CoryMcCartan closed 2 years ago

CoryMcCartan commented 2 years ago

Redistricting requirements

In Iowa, districts must:

  1. be contiguous
  2. have equal populations
  3. be constructed only from counties
  4. be geographically compact, as defined by two compactness measures:
    1. length-width compactness, which measures the total absolute difference between the length and width of a district, across all districts
    2. perimeter compactness, which measures the total perimeter of all districts
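As a rough illustration of the two compactness measures above, here is a minimal Python sketch. It assumes each district is a simple polygon given as a list of (x, y) vertices, and it assumes that "length" and "width" are the east-west and north-south extents of the district's bounding box. The function names and the toy plan are hypothetical and are not part of this analysis.

```python
from math import hypot


def length_width_diff(polygon):
    """Absolute difference between the bounding-box length and width.

    Assumption: length = east-west extent, width = north-south extent.
    """
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return abs((max(xs) - min(xs)) - (max(ys) - min(ys)))


def perimeter(polygon):
    """Sum of edge lengths, closing the ring back to the first vertex."""
    closed = polygon[1:] + polygon[:1]
    return sum(hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(polygon, closed))


def plan_compactness(districts):
    """Return (total length-width difference, total perimeter) for a plan."""
    lw_total = sum(length_width_diff(p) for p in districts)
    perim_total = sum(perimeter(p) for p in districts)
    return lw_total, perim_total


# Toy plan (made up): a 2x1 rectangle next to a 1x1 square.
toy_plan = [
    [(0, 0), (2, 0), (2, 1), (0, 1)],
    [(2, 0), (3, 0), (3, 1), (2, 1)],
]
lw_total, perim_total = plan_compactness(toy_plan)
```

Both measures are totals across all districts, so a single elongated district raises the score for the whole plan.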

Interpretation of requirements

We enforce a maximum population deviation of 0.01%, in keeping with Iowa's historically strict deviation standards. Because districts must be constructed only from counties, we merge VTDs into counties and run the simulation at the county level. To favor compact districts, we increase the compactness parameter from its default to 1.1, which does not cost too much sampling efficiency.
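To make the county merge and the 0.01% tolerance concrete, the following Python sketch sums VTD populations by county and checks a plan's maximum deviation from the equal-population target. The helper names and all population figures are made up for illustration. The deviation is assumed to be measured relative to the ideal district population.

```python
from collections import defaultdict


def merge_to_counties(vtds):
    """Sum VTD populations within each county.

    `vtds` is an iterable of (county_name, population) pairs.
    """
    totals = defaultdict(int)
    for county, pop in vtds:
        totals[county] += pop
    return dict(totals)


def max_deviation(district_pops):
    """Largest relative deviation from the equal-population target."""
    target = sum(district_pops) / len(district_pops)
    return max(abs(p - target) / target for p in district_pops)


def within_tolerance(district_pops, tol=0.0001):
    """True if every district is within `tol` (0.0001 = 0.01%) of target."""
    return max_deviation(district_pops) <= tol


# Made-up numbers, not Iowa data.
counties = merge_to_counties([("A", 10), ("A", 5), ("B", 7)])
ok_plan = within_tolerance([100_000, 100_005, 99_995, 100_000])
bad_plan = within_tolerance([100_000, 100_050, 99_950, 100_000])
```

With whole counties as the building blocks, very few combinations land inside so narrow a band.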

Data Sources

Data for Iowa comes from the ALARM Project's 2020 Redistricting Data Files.

Pre-processing Notes

No manual pre-processing decisions were necessary.

Simulation Notes

We sample 5,000 districting plans for Iowa across two independent runs of the SMC algorithm. As noted above, we set compactness=1.1.

Validation

[Image: validation plots]

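Note: The variation of information (VI) distribution spikes at 0 because, at this low population tolerance, the total number of feasible configurations is small, so many sampled plans coincide.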
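For readers unfamiliar with VI, the sketch below computes the standard, unnormalized variation of information between two plans, each given as a list of district labels per unit. It is only an illustration of the definition: the weighting and any normalization used to produce the validation plot may differ, and the function name is hypothetical.

```python
from collections import Counter
from math import log


def variation_of_information(plan_a, plan_b, weights=None):
    """VI = H(A|B) + H(B|A) for two assignments of units to districts.

    `weights` optionally gives each unit a mass (e.g. population);
    by default every unit counts equally.
    """
    if weights is None:
        weights = [1.0] * len(plan_a)
    total = sum(weights)
    mass_a, mass_b, mass_ab = Counter(), Counter(), Counter()
    for a, b, w in zip(plan_a, plan_b, weights):
        mass_a[a] += w
        mass_b[b] += w
        mass_ab[(a, b)] += w
    vi = 0.0
    for (a, b), w in mass_ab.items():
        if w <= 0:
            continue  # zero-mass overlaps contribute nothing
        p_ab = w / total
        p_a = mass_a[a] / total
        p_b = mass_b[b] / total
        vi -= p_ab * (log(p_ab / p_a) + log(p_ab / p_b))
    return vi


# The same partition with swapped labels is at distance 0.
same = variation_of_information([1, 1, 2, 2], [2, 2, 1, 1])
# Two partitions that cut the units in different ways are not.
different = variation_of_information([1, 1, 2, 2], [1, 2, 1, 2])
```

Because VI ignores district labels, any two sampled plans that split the counties identically sit at distance 0, which is what produces a spike there when few configurations are feasible.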

SMC: 5,000 sampled plans of 4 districts on 99 units
`adapt_k_thresh`=0.985 • `seq_alpha`=0.9
`est_label_mult`=1 • `pop_temper`=0

Plan diversity 80% range: 0.23 to 0.78

R-hat values for summary statistics:
   pop_overlap      total_vap       plan_dev      comp_edge    comp_polsby       pop_hisp      pop_white      pop_black       pop_aian      pop_asian 
        1.0002         1.0076         1.0027         1.0046         1.0274         1.0010         1.0118         1.0050         1.0030         1.0108 
      pop_nhpi      pop_other        pop_two       vap_hisp      vap_white      vap_black       vap_aian      vap_asian       vap_nhpi      vap_other 
        1.0005         1.0105         1.0066         1.0012         1.0136         1.0050         1.0001         1.0112         1.0014         1.0120 
       vap_two pre_16_rep_tru pre_16_dem_cli uss_16_rep_gra uss_16_dem_jud gov_18_rep_rey gov_18_dem_hub atg_18_dem_mil sos_18_rep_pat sos_18_dem_dej 
        1.0061         1.0056         1.0054         1.0123         1.0054         1.0116         1.0060         1.0173         1.0095         1.0041 
pre_20_rep_tru pre_20_dem_bid uss_20_rep_ern uss_20_dem_gre         arv_16         adv_16         arv_18         adv_18         arv_20         adv_20 
        1.0157         1.0060         1.0160         1.0057         1.0111         1.0051         1.0103         1.0057         1.0155         1.0062 
           ndv            nrv        ndshare        comp_lw           area     comp_perim          e_dvs         pr_dem          e_dem          pbias 
        1.0068         1.0113         1.0055         1.0251         1.0057         1.0092         1.0055         1.0011         1.0018         1.0017 
          egap 
        1.0026 

Sampling diagnostics for SMC run 1 of 2 (2,500 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     1,105 (44.2%)      0.1%        1.27 1,580 (100%)      3 
Split 2       907 (36.3%)      0.1%        0.81 1,000 ( 63%)      2 
Split 3       637 (25.5%)      0.0%        0.80   672 ( 43%)      2 
Resample      543 (21.7%)       NA%        0.81 1,161 ( 73%)     NA 

Sampling diagnostics for SMC run 2 of 2 (2,500 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     1,104 (44.2%)      0.3%        1.26 1,559 ( 99%)      1 
Split 2       905 (36.2%)      0.2%        0.87 1,023 ( 65%)      1 
Split 3       923 (36.9%)      0.1%        0.79   665 ( 42%)      1 
Resample    1,340 (53.6%)       NA%        0.79 1,364 ( 86%)     NA 

•  Watch out for low effective samples, very low acceptance rates (less than 1%), large std. devs. of the log weights (more than 3 or so), and low
numbers of unique plans. R-hat values for summary statistics should be between 1 and 1.05.
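The two quantities named in the note above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the computation that produced the tables: effective samples are computed with the usual Kish formula from log importance weights, and R-hat uses the classic Gelman-Rubin form for equal-length runs, which may differ from the variant behind the values reported here. Function names are hypothetical.

```python
from math import exp, sqrt
from statistics import mean, variance


def effective_samples(log_weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    shift = max(log_weights)  # subtract the max for numerical stability
    w = [exp(lw - shift) for lw in log_weights]
    return sum(w) ** 2 / sum(x * x for x in w)


def r_hat(chains):
    """Classic Gelman-Rubin R-hat for equal-length runs of one statistic."""
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    between = n * variance(chain_means)           # between-run variance
    within = mean([variance(c) for c in chains])  # mean within-run variance
    pooled = (n - 1) / n * within + between / n
    return sqrt(pooled / within)


# Equal weights: every sample counts fully.
ess_equal = effective_samples([0.0, 0.0, 0.0, 0.0])
# One dominant weight: effectively a single sample.
ess_skewed = effective_samples([0.0, -1000.0])
# Runs that agree give a small R-hat; runs that disagree give a large one.
rhat_agree = r_hat([[1, 2, 3, 4], [1, 2, 3, 4]])
rhat_disagree = r_hat([[1, 2, 3, 4], [11, 12, 13, 14]])
```

In this classic form, very short runs that agree closely can yield a value slightly below 1, so the sketch is best read as a qualitative check that independent runs agree.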

Checklist

@christopherkenny

christopherkenny commented 2 years ago

Looks good!