Re-run 2020 Massachusetts Congressional Districts

Redistricting requirements

In Massachusetts, districts must:

be contiguous
have equal populations
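
As a rough illustration of the contiguity requirement, the sketch below checks that every district in a plan forms one connected piece of the unit adjacency graph. The function names, the toy grid, and the dictionary-based data layout are all hypothetical; this is not how the simulation code itself represents plans.

```python
from collections import deque


def is_contiguous(district_units, adjacency):
    """Return True if the given units form one connected piece.

    adjacency maps each unit to the units it borders (hypothetical layout).
    """
    units = set(district_units)
    if not units:
        return False
    start = next(iter(units))
    seen = {start}
    queue = deque([start])
    # Breadth-first search restricted to units inside the district.
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, ()):
            if v in units and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == units


def plan_is_contiguous(assignment, adjacency):
    """assignment maps unit -> district label; every district must be connected."""
    districts = {}
    for unit, label in assignment.items():
        districts.setdefault(label, []).append(unit)
    return all(is_contiguous(us, adjacency) for us in districts.values())


# Toy 2x3 grid of units (illustrative only):
#   a b c
#   d e f
adjacency = {
    "a": ["b", "d"], "b": ["a", "c", "e"], "c": ["b", "f"],
    "d": ["a", "e"], "e": ["b", "d", "f"], "f": ["c", "e"],
}
good_plan = {"a": 1, "b": 1, "d": 1, "c": 2, "e": 2, "f": 2}
bad_plan = {"a": 1, "c": 1, "e": 1, "b": 2, "d": 2, "f": 2}
```

In the toy grid, `good_plan` keeps each district connected, while `bad_plan` scatters district 1 across units that do not touch.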

Interpretation of requirements

We enforce a maximum population deviation of 0.5%. We apply the basic algorithmic county constraint to pseudo counties, since Congressional plans in MA appear to follow county and municipal boundaries even though no law requires it. Pseudo counties follow municipal boundaries within counties whose population exceeds that of a single district, and county lines everywhere else.
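
The two rules in this section can be sketched in a few lines. The snippet below is a minimal illustration, assuming deviation is measured as the largest relative gap between a district's population and the ideal (total population divided by the number of districts), and assuming each unit carries a county name, a municipality name, and a population. The function names, field names, and toy numbers are hypothetical and are not taken from the analysis code.

```python
def max_population_deviation(district_pops):
    """Largest relative deviation of any district from the ideal population."""
    target = sum(district_pops) / len(district_pops)
    return max(abs(p - target) / target for p in district_pops)


def pseudo_county_labels(units, n_districts):
    """Label each unit with its pseudo county.

    units: list of dicts with hypothetical keys 'county', 'muni', 'pop'.
    Counties with more people than one ideal district are broken into
    municipalities; all other counties are kept whole.
    """
    target = sum(u["pop"] for u in units) / n_districts
    county_pop = {}
    for u in units:
        county_pop[u["county"]] = county_pop.get(u["county"], 0) + u["pop"]
    labels = []
    for u in units:
        if county_pop[u["county"]] > target:
            labels.append(f"{u['county']}/{u['muni']}")  # municipal boundary
        else:
            labels.append(u["county"])  # county line
    return labels


# Toy data: county A (110 people) exceeds the ideal district size of 100.
toy_units = [
    {"county": "A", "muni": "A1", "pop": 60},
    {"county": "A", "muni": "A2", "pop": 50},
    {"county": "B", "muni": "B1", "pop": 30},
    {"county": "B", "muni": "B2", "pop": 20},
    {"county": "C", "muni": "C1", "pop": 40},
]
toy_labels = pseudo_county_labels(toy_units, n_districts=2)

# A three-district toy plan that stays within a 0.5% tolerance.
toy_deviation = max_population_deviation([1000, 1004, 996])
within_tolerance = toy_deviation <= 0.005
```

Only county A is split into municipalities here, because it is the only county too large to fit inside a single district.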

Data Sources

Data for Massachusetts comes from the ALARM Project's 2020 Redistricting Data Files.

Pre-processing Notes

No manual pre-processing decisions were necessary.

Simulation Notes

We sample 5,000 districting plans for Massachusetts across 2 independent runs of the SMC algorithm (2,500 plans per run). No special techniques were needed to produce the sample.

Validation

[Image: validation_20220622_1018]

SMC: 5,000 sampled plans of 9 districts on 2,157 units
`adapt_k_thresh`=0.985 • `seq_alpha`=0.5
`est_label_mult`=1 • `pop_temper`=0

Plan diversity 80% range: 0.50 to 0.77
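
For readers unfamiliar with this diagnostic, the sketch below shows one way an "80% range" of plan diversity could be computed: take pairwise variation-of-information distances between plans, then report their 10th and 90th percentiles. It assumes population-weighted, unnormalized distances over all pairs; the scale, normalization, and subsampling behind the summary line above may differ, and the function names and toy plans are hypothetical.

```python
import math
from itertools import combinations
from statistics import quantiles


def variation_of_information(plan_a, plan_b, pops):
    """Population-weighted variation of information between two plans.

    plan_a, plan_b: district labels per unit; pops: population per unit.
    VI = H(A|B) + H(B|A); it is zero when the plans match up to relabeling.
    """
    total = float(sum(pops))
    p_a, p_b, p_ab = {}, {}, {}
    for a, b, w in zip(plan_a, plan_b, pops):
        share = w / total
        p_a[a] = p_a.get(a, 0.0) + share
        p_b[b] = p_b.get(b, 0.0) + share
        p_ab[(a, b)] = p_ab.get((a, b), 0.0) + share
    vi = 0.0
    for (a, b), p in p_ab.items():
        if p > 0:
            vi -= p * (math.log(p / p_a[a]) + math.log(p / p_b[b]))
    return vi


def diversity_range(plans, pops):
    """10th and 90th percentiles of all pairwise distances (an 80% range)."""
    dists = [variation_of_information(a, b, pops)
             for a, b in combinations(plans, 2)]
    cuts = quantiles(dists, n=10, method="inclusive")
    return cuts[0], cuts[-1]


# Toy example: four plans on four equally populated units.
toy_pops = [10, 10, 10, 10]
toy_plans = [
    [1, 1, 2, 2],
    [2, 2, 1, 1],  # same plan as the first, with labels swapped
    [1, 2, 1, 2],
    [1, 2, 2, 1],
]
low, high = diversity_range(toy_plans, toy_pops)
```

A range that sits well away from zero indicates that sampled plans differ substantially from one another.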

R-hat values for summary statistics:
   pop_overlap      total_vap       plan_dev      comp_edge    comp_polsby       pop_hisp      pop_white      pop_black       pop_aian 
      1.002150       1.000553       1.000111       1.001274       1.008911       1.005690       1.002411       1.001368       1.003956 
     pop_asian       pop_nhpi      pop_other        pop_two       vap_hisp      vap_white      vap_black       vap_aian      vap_asian 
      1.001103       1.008611       1.003410       1.005894       1.007066       1.004120       1.001463       1.003263       1.001391 
      vap_nhpi      vap_other        vap_two pre_16_dem_cli pre_16_rep_tru uss_18_dem_war uss_18_rep_die gov_18_rep_bak gov_18_dem_gon 
      1.015456       1.004568       1.005248       1.008218       1.008624       1.008301       1.006479       1.004302       1.006965 
atg_18_dem_hea atg_18_rep_mcm pre_20_dem_bid pre_20_rep_tru uss_20_dem_mar uss_20_rep_oco         arv_16         adv_16         arv_18 
      1.007778       1.007903       1.007205       1.010664       1.005903       1.008785       1.008624       1.008218       1.005524 
        adv_18         arv_20         adv_20  county_splits    muni_splits            ndv            nrv        ndshare          e_dvs 
      1.008336       1.009318       1.006898       1.000273       1.010312       1.008313       1.008281       1.007032       1.006904 
         e_dem          pbias           egap 
      1.017826       1.005076       1.016740 
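
As background on how to read these numbers, the sketch below computes the classic Gelman-Rubin R-hat for one summary statistic across independent runs. It is a simplified illustration: the values above may come from a refined variant (for example, a split or rank-normalized version), and the function name and toy chains are hypothetical.

```python
from statistics import mean, variance


def r_hat(chains):
    """Classic Gelman-Rubin R-hat for equal-length chains of one statistic."""
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    # W: average within-chain variance.
    within = mean([variance(c) for c in chains])
    # B: between-chain variance, scaled by the chain length.
    between = n * variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


# Two toy runs that agree closely, and two that clearly disagree.
agreeing = [[1.0, 2.0, 3.0, 4.0], [1.1, 2.1, 2.9, 4.0]]
disagreeing = [[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]]

r_agree = r_hat(agreeing)
r_disagree = r_hat(disagreeing)
```

When the runs agree, the between-run term is small and R-hat stays near 1; when the runs settle on different values, R-hat grows well past the 1.05 guideline.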

Sampling diagnostics for SMC run 1 of 2 (2,500 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     2,388 (95.5%)     13.5%        0.46 1,615 (102%)      9 
Split 2     2,354 (94.2%)     17.5%        0.45 1,579 (100%)      6 
Split 3     2,356 (94.2%)     21.6%        0.45 1,559 ( 99%)      4 
Split 4     2,345 (93.8%)     25.3%        0.49 1,525 ( 97%)      3 
Split 5     2,309 (92.4%)     29.9%        0.52 1,536 ( 97%)      2 
Split 6     2,266 (90.6%)     24.2%        0.57 1,524 ( 96%)      2 
Split 7     2,304 (92.2%)     17.6%        0.54 1,449 ( 92%)      2 
Split 8     2,345 (93.8%)      5.7%        0.49 1,244 ( 79%)      2 
Resample    1,917 (76.7%)       NA%        0.50 1,488 ( 94%)     NA 
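
To make the "Eff. samples" column easier to interpret, the sketch below computes the standard (Kish) effective sample size from importance weights and reproduces two of the percentages in the table above. Whether the diagnostics use exactly this formula is an assumption, and the function name is hypothetical.

```python
import math


def effective_sample_size(log_weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    # Subtract the maximum before exponentiating for numerical stability;
    # the ratio is unchanged by rescaling the weights.
    m = max(log_weights)
    w = [math.exp(lw - m) for lw in log_weights]
    return sum(w) ** 2 / sum(x * x for x in w)


n_per_run = 2500

# Equal weights: every sample counts fully.
ess_equal = effective_sample_size([0.0] * n_per_run)

# One dominant weight: the sample is effectively a single plan.
ess_skewed = effective_sample_size([0.0] + [-50.0] * (n_per_run - 1))

# Percentages as reported for run 1 (Split 1 and the final resample).
pct_split_1 = round(100 * 2388 / n_per_run, 1)
pct_resample = round(100 * 1917 / n_per_run, 1)
```

Values close to the number of samples per run, as in this table, indicate that the weights are fairly even.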

Sampling diagnostics for SMC run 2 of 2 (2,500 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     2,382 (95.3%)     15.1%        0.47 1,566 ( 99%)      8 
Split 2     2,352 (94.1%)     17.4%        0.45 1,548 ( 98%)      6 
Split 3     2,349 (94.0%)     21.8%        0.46 1,568 ( 99%)      4 
Split 4     2,350 (94.0%)     19.3%        0.48 1,539 ( 97%)      4 
Split 5     2,316 (92.7%)     21.9%        0.53 1,558 ( 99%)      3 
Split 6     2,282 (91.3%)     17.5%        0.58 1,482 ( 94%)      3 
Split 7     2,343 (93.7%)      9.6%        0.51 1,442 ( 91%)      4 
Split 8     2,346 (93.8%)      4.1%        0.49 1,261 ( 80%)      3 
Resample    1,942 (77.7%)       NA%        0.50 1,489 ( 94%)     NA 
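
The percentages in the "Max. unique" column can exceed 100%, which suggests they are relative to an expected count rather than to the number of samples. The figures in both tables are consistent with dividing by n(1 - 1/e), the approximate number of distinct items expected when drawing n times with replacement from n items. This is an inference from the reported numbers, not a statement about the library's source code, and the function names below are hypothetical.

```python
import math


def expected_unique(n):
    """Approximate number of distinct items in n draws with replacement from n."""
    return n * (1 - math.exp(-1))


def unique_pct(n_unique, n):
    """Unique count as a whole-number percentage of the expected unique count."""
    return round(100 * n_unique / expected_unique(n))


n_per_run = 2500

# (unique count, reported percentage) pairs copied from the tables above.
reported = [
    (1615, 102), (1579, 100), (1525, 97), (1524, 96),
    (1244, 79), (1261, 80), (1488, 94), (1489, 94),
]
all_match = all(unique_pct(u, n_per_run) == pct for u, pct in reported)
```

Under this reading, a value near 100% means a resampling step kept about as many distinct plans as resampling from equal weights would.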

•  Watch out for low effective samples, very low acceptance rates (less than 1%), large std. devs. of the log weights (more than 3 or so), and low
numbers of unique plans. R-hat values for summary statistics should be between 1 and 1.05.
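
The guidance above translates into a simple threshold check. The sketch below applies the stated cutoffs (acceptance rate below 1%, log-weight standard deviation above 3, R-hat outside 1 to 1.05) to values copied from this report. The function name and its interface are hypothetical, and the low effective sample and unique plan checks are omitted because the text gives no numeric cutoff for them.

```python
def diagnostic_warnings(acc_rates, log_wgt_sds, r_hats,
                        min_acc=0.01, max_sd=3.0, r_hat_range=(1.0, 1.05)):
    """Return a list of human-readable warnings; empty if nothing is flagged."""
    warnings = []
    for i, rate in enumerate(acc_rates, start=1):
        if rate < min_acc:
            warnings.append(f"split {i}: acceptance rate {rate:.1%} is very low")
    for i, sd in enumerate(log_wgt_sds, start=1):
        if sd > max_sd:
            warnings.append(f"split {i}: log weight sd {sd} is large")
    lo, hi = r_hat_range
    for name, value in r_hats.items():
        if not lo <= value <= hi:
            warnings.append(f"{name}: R-hat {value} outside [{lo}, {hi}]")
    return warnings


# Values copied from SMC run 2 and from the R-hat table (smallest and largest).
run_2_acc = [0.151, 0.174, 0.218, 0.193, 0.219, 0.175, 0.096, 0.041]
run_2_sd = [0.47, 0.45, 0.46, 0.48, 0.53, 0.58, 0.51, 0.49]
extreme_r_hats = {"plan_dev": 1.000111, "e_dem": 1.017826}

report_warnings = diagnostic_warnings(run_2_acc, run_2_sd, extreme_r_hats)

# A made-up failing case, for contrast.
bad_warnings = diagnostic_warnings([0.005], [3.5], {"example_stat": 1.2})
```

With the values from this report, no warnings are raised: the lowest acceptance rate is 4.1%, the largest log-weight standard deviation is 0.58, and the largest R-hat is about 1.018.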

Checklist

[x] I have followed the instructions
[x] I have updated the tracker
[x] All TODO lines from the template code have been removed
[x] I have merged in the master branch and then recalculated summary statistics
[x] I have run enforce_style() to format my code
[x] The documentation copied above is up-to-date
[x] There are no data files in this pull request
[x] None of the file output paths (for the redist_map and redist_plans objects, and summary statistics) have been edited

@CoryMcCartan

alarm-redist / fifty-states