alarm-redist / fifty-states

Redistricting analysis for all 50 U.S. states
https://alarm-redist.github.io/fifty-states/

Re-run 2020 Maine Congressional Districts #106

Closed christopherkenny closed 2 years ago

christopherkenny commented 2 years ago

Redistricting requirements

In Maine, following Title 21-A, Chapter 15, Section 1206, districts must:

  1. be contiguous (1)
  2. have equal populations (1)
  3. be geographically compact (1)
  4. preserve county and municipality boundaries as much as possible (1)

Interpretation of requirements

We enforce a maximum population deviation of 0.5%. We apply the standard algorithmic county constraint.
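As a rough illustration of what a 0.5% maximum population deviation means, the sketch below measures each district's relative distance from the equal-population target and checks it against the tolerance. The function names and the example populations are hypothetical (they are not Census figures and not part of the redist API), and this assumes deviation is measured relative to the ideal district population.

```python
def max_population_deviation(district_pops):
    """Largest relative deviation of any district from the equal-population target."""
    target = sum(district_pops) / len(district_pops)
    return max(abs(p - target) / target for p in district_pops)


def within_tolerance(district_pops, tol=0.005):
    """True if every district is within `tol` (0.5% by default) of the target."""
    return max_population_deviation(district_pops) <= tol


# Hypothetical populations for a 2-district plan (illustrative only).
example_plan = [681_000, 679_000]
example_dev = max_population_deviation(example_plan)
example_ok = within_tolerance(example_plan)
```

With two districts, both districts always deviate from the target by the same amount in opposite directions, so the check reduces to half the population gap divided by the target.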

Data Sources

Data for Maine comes from the Voting and Election Science Team for 2016, 2018, and 2020. It is retabulated to 2020 Census tracts, as 2020 Census VTDs do not cover the majority of Maine's geography.

Pre-processing Notes

Island tracts were connected to the nearest tract within the same district.
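One way to picture this step is as an edit to the tract adjacency graph. The sketch below is an assumption about the general approach, not the project's actual pre-processing code: it links every tract that has no neighbors to the closest tract (by centroid distance) in the same district. The function name, the dictionary-based inputs, and the toy data are all hypothetical, and it only handles single isolated tracts, not multi-tract island groups.

```python
import math


def connect_islands(adj, centroids, district):
    """Return a copy of `adj` in which each unit with no neighbors is linked
    to the nearest unit (by centroid distance) in the same district."""
    new_adj = {u: set(nbrs) for u, nbrs in adj.items()}
    for u, nbrs in adj.items():
        if nbrs:
            continue  # already connected to at least one neighbor
        candidates = [v for v in adj if v != u and district[v] == district[u]]
        if not candidates:
            continue  # no other unit in this district to connect to
        nearest = min(candidates, key=lambda v: math.dist(centroids[u], centroids[v]))
        # Adjacency is symmetric, so add the edge in both directions.
        new_adj[u].add(nearest)
        new_adj[nearest].add(u)
    return new_adj


# Toy example: "I" is an island assigned to district 1.
toy_adj = {"A": {"B"}, "B": {"A"}, "C": {"D"}, "D": {"C"}, "I": set()}
toy_centroids = {"A": (0, 0), "B": (1, 0), "C": (5, 0), "D": (6, 0), "I": (4, 0)}
toy_district = {"A": 1, "B": 1, "C": 2, "D": 2, "I": 1}
toy_fixed = connect_islands(toy_adj, toy_centroids, toy_district)
```

In the toy data the island is geographically closest to "C", but it is linked to "B" because "C" belongs to the other district.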

Simulation Notes

We sample 5,000 districting plans for Maine across 4 independent runs of the SMC algorithm (1,250 plans per run). We use the standard county constraint. Because Maine has only two districts and relatively few tracts, we weaken the compactness parameter to 0.9 to encourage more diversity in the sample.

Validation

validation_20220622_0032

SMC: 5,000 sampled plans of 2 districts on 401 units
`adapt_k_thresh`=0.985 • `seq_alpha`=0.5
`est_label_mult`=1 • `pop_temper`=0

Plan diversity 80% range: 0.016 to 0.483
✖ WARNING: Low plan diversity

R-hat values for summary statistics:
   pop_overlap      total_vap       plan_dev      comp_edge    comp_polsby      pop_white      pop_black       pop_hisp       pop_aian 
     1.0013128      1.0011930      1.0012554      1.0001128      1.0009325      1.0004023      1.0000092      0.9999668      1.0021643 
     pop_asian       pop_nhpi      pop_other        pop_two      vap_white      vap_black       vap_hisp       vap_aian      vap_asian 
     1.0017337      1.0025655      1.0022458      1.0000240      1.0004747      1.0000503      1.0008113      1.0023318      1.0019800 
      vap_nhpi      vap_other        vap_two pre_16_dem_cli pre_16_rep_tru gov_18_dem_mil gov_18_rep_moo uss_18_dem_rin uss_18_rep_bra 
     1.0021021      1.0016257      0.9999838      1.0008052      1.0008569      1.0007473      1.0011623      1.0009893      1.0016979 
pre_20_dem_bid pre_20_rep_tru uss_20_dem_gid uss_20_rep_col         arv_16         adv_16         arv_18         adv_18         arv_20 
     1.0009598      1.0003298      1.0008160      1.0005918      1.0008569      1.0008052      1.0014489      1.0004953      1.0004344 
        adv_20  county_splits    muni_splits            ndv            nrv        ndshare          e_dvs         pr_dem          e_dem 
     1.0008716      1.0011363      1.0002822      1.0008561      1.0005056      1.0004675      1.0004234      1.0013797      1.0001447 
          egap 
     1.0018267 

Sampling diagnostics for SMC run 1 of 4 (1,250 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     1,096 (87.7%)      9.9%        0.56   779 ( 99%)      5 
Resample      791 (63.2%)       NA%        0.56   715 ( 90%)     NA 

Sampling diagnostics for SMC run 2 of 4 (1,250 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     1,113 (89.1%)     10.3%        0.52   785 ( 99%)      5 
Resample      846 (67.7%)       NA%        0.52   723 ( 92%)     NA 

Sampling diagnostics for SMC run 3 of 4 (1,250 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     1,080 (86.4%)     10.3%        0.56   780 ( 99%)      5 
Resample      776 (62.1%)       NA%        0.56   705 ( 89%)     NA 

Sampling diagnostics for SMC run 4 of 4 (1,250 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1       996 (79.7%)     10.0%        0.57   805 (102%)      5 
Resample      773 (61.8%)       NA%        0.57   699 ( 88%)     NA 

•  Watch out for low effective samples, very low acceptance rates (less than 1%), large std. devs. of the log weights (more than 3 or so),
and low numbers of unique plans. R-hat values for summary statistics should be between 1 and 1.05.
• Low diversity: Check for potential bottlenecks. Increase the number of samples. Examine the diversity plot with
`hist(plans_diversity(plans), breaks=24)`. Consider weakening or removing constraints, or increasing the population tolerance. If the
acceptance rate drops quickly in the final splits, try increasing `pop_temper` by 0.01.
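For readers unfamiliar with R-hat, the sketch below computes the classic Gelman-Rubin statistic for one summary statistic across several independent runs. This is an assumption for illustration: the redist diagnostics may use a different variant (for example a rank-normalized or split version), and the function name and toy inputs are hypothetical. It also shows why finite-sample values can land slightly below 1, as a few do in the table above.

```python
from statistics import mean, variance


def gelman_rubin_rhat(chains):
    """Classic Gelman-Rubin R-hat for equal-length runs of one statistic."""
    n = len(chains[0])
    if any(len(c) != n for c in chains):
        raise ValueError("all runs must have the same number of samples")
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])   # W: average within-run variance
    between = n * variance(chain_means)            # B: between-run variance
    var_hat = (n - 1) / n * within + between / n   # pooled variance estimate
    return (var_hat / within) ** 0.5


# Toy inputs: identical runs agree perfectly; shifted runs clearly disagree.
rhat_same = gelman_rubin_rhat([[1, 2, 3, 4], [1, 2, 3, 4]])
rhat_shifted = gelman_rubin_rhat([[1, 2, 3, 4], [11, 12, 13, 14]])
```

When the runs agree, the between-run term vanishes and the ratio falls just under 1; when they disagree, the between-run term dominates and R-hat rises well above the 1.05 guideline.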

Checklist

@CoryMcCartan

Note: Maine is a 2-district state, so I ran 4 runs of 1,250 samples each for diversity.