alarm-redist / fifty-states

Redistricting analysis for all 50 U.S. states
https://alarm-redist.github.io/fifty-states/

Re-run 2020 Oklahoma Congressional Districts #108

Closed christopherkenny closed 2 years ago

christopherkenny commented 2 years ago

Redistricting requirements

In Oklahoma, districts must:

  1. be contiguous (C)
  2. have equal populations (A.2)
  3. be geographically compact (C)
  4. preserve county and municipality boundaries as much as possible (C)

Interpretation of requirements

We enforce a maximum population deviation of 0.5%. We apply a county/municipality constraint, as described below.
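As a rough illustration of the 0.5% bound, the sketch below checks a plan's district populations against the ideal (equal) population. It assumes deviation is measured as each district's absolute difference from the ideal, divided by the ideal. The function names and the example populations are hypothetical and are not taken from the analysis code or from Oklahoma data.

```python
def max_population_deviation(district_pops):
    """Return the largest relative deviation from the ideal district population."""
    target = sum(district_pops) / len(district_pops)
    return max(abs(p - target) / target for p in district_pops)


def within_tolerance(district_pops, tol=0.005):
    """True if every district is within `tol` (0.5% here) of the ideal population."""
    return max_population_deviation(district_pops) <= tol


# Hypothetical district populations, for illustration only (not Oklahoma data).
example_pops = [791_000, 792_500, 790_200, 793_100, 792_000]
print(within_tolerance(example_pops))
```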

Data Sources

Data for Oklahoma comes from the ALARM Project's 2020 Redistricting Data Files.

Pre-processing Notes

No manual pre-processing decisions were necessary.

Simulation Notes

We sample 5,000 districting plans for Oklahoma across 2 independent runs of the SMC algorithm. We use a pseudo-county constraint that treats each county as a unit, except in Oklahoma County, where municipalities are used instead. No special techniques were needed to produce the sample.
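One way to picture the pseudo-county labeling is the sketch below: every unit takes its county as its label, except units in Oklahoma County, which are labeled by municipality. The helper `pseudo_county`, the label format, and the handling of units with no municipality (they fall back to the county label) are assumptions made for illustration, not the project's implementation.

```python
def pseudo_county(county, muni, split_counties=frozenset({"Oklahoma County"})):
    """Label a unit by county, or by municipality inside the designated counties."""
    if county in split_counties and muni is not None:
        return f"{county}: {muni}"
    # Assumption: units without a municipality keep the plain county label.
    return county


# Illustrative units only; a real analysis would label every precinct.
units = [
    {"county": "Tulsa County", "muni": "Tulsa"},
    {"county": "Oklahoma County", "muni": "Oklahoma City"},
    {"county": "Oklahoma County", "muni": "Edmond"},
    {"county": "Oklahoma County", "muni": None},
]
labels = [pseudo_county(u["county"], u["muni"]) for u in units]
print(labels)
```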

Validation

validation_20220622_0105

SMC: 5,000 sampled plans of 5 districts on 1,947 units
`adapt_k_thresh`=0.985 • `seq_alpha`=0.5
`est_label_mult`=1 • `pop_temper`=0

Plan diversity 80% range: 0.59 to 0.91

R-hat values for summary statistics:
   pop_overlap      total_vap       plan_dev      comp_edge    comp_polsby       pop_hisp      pop_white      pop_black       pop_aian 
     1.0026971      1.0016413      0.9999266      1.0000279      1.0025800      1.0012320      1.0037122      1.0072595      1.0007859 
     pop_asian       pop_nhpi      pop_other        pop_two       vap_hisp      vap_white      vap_black       vap_aian      vap_asian 
     1.0035005      1.0008942      1.0004423      1.0002995      1.0010506      1.0035990      1.0075433      1.0005558      1.0023831 
      vap_nhpi      vap_other        vap_two pre_16_rep_tru pre_16_dem_cli uss_16_rep_lan uss_16_dem_wor gov_18_rep_sti gov_18_dem_edm 
     1.0010786      1.0000068      1.0000888      1.0005939      1.0011868      1.0010260      1.0034591      1.0007339      1.0004562 
atg_18_rep_hun atg_18_dem_myl pre_20_rep_tru pre_20_dem_bid uss_20_rep_inh uss_20_dem_bro         arv_16         adv_16         arv_18 
     1.0011417      1.0017219      1.0005910      1.0010144      1.0006824      1.0019222      1.0009923      1.0021257      1.0009233 
        adv_18         arv_20         adv_20  county_splits    muni_splits            ndv            nrv        ndshare          e_dvs 
     1.0009864      1.0006110      1.0012889      1.0013199      1.0016048      1.0011041      1.0008314      1.0022144      1.0021898 
         e_dem          pbias           egap 
     0.9999422      1.0005190      1.0017663 
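For readers unfamiliar with R-hat, the sketch below computes the classic Gelman-Rubin statistic for one summary statistic across independent runs. It is only an illustration of the idea: the validation output above may come from a different R-hat variant, and the function name is hypothetical. With this formula, values slightly below 1 (such as the `plan_dev` entry above) are possible when the runs agree closely.

```python
from statistics import mean, variance


def gelman_rubin_rhat(chains):
    """Classic Gelman-Rubin R-hat for equal-length chains of one statistic."""
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least 2 chains of equal length >= 2")
    # W: average within-chain variance; B: between-chain variance scaled by n.
    w = mean([variance(c) for c in chains])
    b = n * variance([mean(c) for c in chains])
    pooled = (n - 1) / n * w + b / n
    return (pooled / w) ** 0.5


# Two runs that agree exactly give a value just under 1.
agree = gelman_rubin_rhat([[1, 2, 3, 4], [1, 2, 3, 4]])
# Two runs centered far apart give a value well above 1.05.
disagree = gelman_rubin_rhat([[0, 1, 2, 3], [10, 11, 12, 13]])
print(agree, disagree)
```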

Sampling diagnostics for SMC run 1 of 2 (2,500 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     2,454 (98.2%)     11.8%        0.27 1,588 (100%)      7 
Split 2     2,416 (96.6%)     15.6%        0.38 1,559 ( 99%)      5 
Split 3     2,386 (95.4%)     22.4%        0.42 1,524 ( 96%)      3 
Split 4     2,352 (94.1%)     10.0%        0.47 1,368 ( 87%)      2 
Resample    1,871 (74.8%)       NA%        0.46 1,470 ( 93%)     NA 

Sampling diagnostics for SMC run 2 of 2 (2,500 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     2,454 (98.2%)      8.4%        0.27 1,567 ( 99%)     10 
Split 2     2,413 (96.5%)     13.4%        0.38 1,566 ( 99%)      6 
Split 3     2,335 (93.4%)     14.5%        0.49 1,517 ( 96%)      5 
Split 4     2,304 (92.2%)      7.6%        0.51 1,400 ( 89%)      3 
Resample    1,637 (65.5%)       NA%        0.50 1,444 ( 91%)     NA 
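The "Eff. samples" column can be read as an effective sample size derived from the importance weights. The sketch below uses the standard Kish formula, (sum of weights)^2 / (sum of squared weights), on log weights. Whether the sampler computes its column in exactly this way is an assumption, and the function name is hypothetical.

```python
import math


def effective_sample_size(log_weights):
    """Kish effective sample size from unnormalized log importance weights."""
    # Subtract the maximum before exponentiating for numerical stability.
    shift = max(log_weights)
    weights = [math.exp(lw - shift) for lw in log_weights]
    return sum(weights) ** 2 / sum(w * w for w in weights)


# Equal weights: every sample counts fully.
print(effective_sample_size([0.0] * 10))
# One dominant weight: the sample behaves like a single draw.
print(effective_sample_size([0.0, -1000.0]))
```

Under this reading, the resample rows above correspond to 1,871 / 2,500 ≈ 74.8% and 1,637 / 2,500 ≈ 65.5% of the nominal sample in each run.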

•  Watch out for low effective samples, very low acceptance rates (less than 1%), large std. devs. of the log weights (more than 3 or so), and low numbers of unique plans. R-hat values for summary statistics should be between 1 and 1.05.
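The numeric rules of thumb in the note above can be applied mechanically, as in the sketch below. It covers only the thresholds the note states numerically (acceptance rate below 1%, log weight standard deviation above 3, R-hat above 1.05). Treating R-hat values just under 1 as acceptable is an assumption, and the helper names are hypothetical.

```python
def check_split(acc_rate, log_wgt_sd):
    """Return warnings for one split, using the thresholds from the note."""
    warnings = []
    if acc_rate is not None and acc_rate < 0.01:
        warnings.append("very low acceptance rate")
    if log_wgt_sd > 3:
        warnings.append("large std. dev. of log weights")
    return warnings


def check_rhat(value, upper=1.05):
    """Return a warning list if R-hat exceeds the upper bound."""
    return ["R-hat above 1.05"] if value > upper else []


# Values copied from the output above: run 2, split 4, and the largest R-hat.
split_warnings = check_split(acc_rate=0.076, log_wgt_sd=0.51)
rhat_warnings = check_rhat(1.0075433)  # vap_black
print(split_warnings, rhat_warnings)
```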

Checklist

@CoryMcCartan