alarm-redist / fifty-states

Redistricting analysis for all 50 U.S. states
https://alarm-redist.github.io/fifty-states/

2020 Missouri Congressional Districts #53

Closed christopherkenny closed 2 years ago

christopherkenny commented 2 years ago

Redistricting requirements

In Missouri, districts must:

  1. be contiguous
  2. have equal populations
  3. be geographically compact

Interpretation of requirements

We enforce a maximum population deviation of 0.5%. We apply a basic county constraint to be in line with the splits in the plan, though there is no legal requirement. We add a VRA constraint targeting one BVAP opportunity district.
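As a rough illustration of the 0.5% population tolerance, the sketch below computes the largest relative deviation from the ideal district population and checks it against that bound. The populations are made-up numbers, and `max_pop_dev` is a hypothetical helper written for this example, not a function from the redist package.

```r
# Hypothetical helper: largest relative deviation from the ideal (mean) district population.
max_pop_dev <- function(district_pop) {
  ideal <- sum(district_pop) / length(district_pop)
  max(abs(district_pop - ideal) / ideal)
}

# Made-up populations for 8 districts (illustrative only, not Missouri data).
pops <- c(769000, 770500, 768200, 771000, 769800, 770100, 768900, 770500)

dev <- max_pop_dev(pops)
within_tolerance <- dev <= 0.005  # 0.5% tolerance from the text
```

Here the ideal population is 769,750 and the furthest district is 1,550 people away from it, so the plan sits at roughly 0.2% deviation and passes the check.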

Data Sources

Data for Missouri comes from the ALARM Project's 2020 Redistricting Data Files.

Pre-processing Notes

No manual pre-processing decisions were necessary.

Simulation Notes

We sample 5,000 districting plans for Missouri. We use a standard algorithmic county constraint. No special techniques were needed to produce the sample.

Validation

validation_20220128_1629

Checklist

@CoryMcCartan

Additional Notes:

Here is a performance plot for BVAP. image

CoryMcCartan commented 2 years ago

Looks good. While we wait for the final plan, let's over-generate (maybe 6k instead of 5k), discard the few plans with minority VAP < 50%, and then subsample to a final 5k. MO-1 is a majority-minority district, if not a majority-Black district, and it seems reasonable to enforce that.
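A minimal base R sketch of the over-generate, filter, and subsample idea follows. The minority VAP shares are simulated placeholders (in practice they would be computed from the sampled plans), and the variable names are assumptions made for this example.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

n_generated <- 6000
n_final <- 5000

# Placeholder: highest minority VAP share among each plan's districts.
# Simulated values for illustration only; real values come from the sampled plans.
max_minority_share <- runif(n_generated, min = 0.495, max = 0.60)

# Discard plans without a majority-minority district.
keep <- which(max_minority_share >= 0.5)

# Subsample without replacement to the final size; stop if too few plans remain.
stopifnot(length(keep) >= n_final)
final_draws <- sample(keep, n_final)
```

Over-generating by about 20% leaves headroom so that the filter can drop some plans and the final sample still reaches 5,000.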

christopherkenny commented 2 years ago

Redistricting requirements

In Missouri, districts must:

  1. be contiguous
  2. have equal populations
  3. be geographically compact

Interpretation of requirements

We enforce a maximum population deviation of 0.5%. We apply a basic county constraint to be in line with the splits in the plan, though there is no legal requirement. We add a VRA constraint targeting one BVAP opportunity district.

Data Sources

Data for Missouri comes from the ALARM Project's 2020 Redistricting Data Files.

Pre-processing Notes

No manual pre-processing decisions were necessary.

Simulation Notes

We sample 6,000 districting plans for Missouri and subset to 5,000 plans that each contain at least one majority-minority district. We use a standard algorithmic county constraint. No special techniques were needed to produce the sample.

Validation

validation_20220202_1107

Extra Plot

image

@CoryMcCartan

CoryMcCartan commented 2 years ago

Great!

christopherkenny commented 2 years ago

Redistricting requirements

In Missouri, districts must:

  1. be contiguous
  2. have equal populations
  3. be geographically compact

Interpretation of requirements

We enforce a maximum population deviation of 0.5%. We apply a basic county constraint to be in line with the splits in the plan, though there is no legal requirement. We add a VRA constraint targeting one BVAP opportunity district.

Data Sources

Data for Missouri comes from the ALARM Project's 2020 Redistricting Data Files.

Pre-processing Notes

No manual pre-processing decisions were necessary.

Simulation Notes

We sample 20,000 districting plans for Missouri across two independent runs of the SMC algorithm, and then thin the sample down to 5,000 plans. We use a standard algorithmic county constraint. No special techniques were needed to produce the sample.

Validation

validation_20220613_1157

Additional Notes

image @CoryMcCartan

CoryMcCartan commented 2 years ago

also, summary(plans) output?

christopherkenny commented 2 years ago

Redistricting requirements

In Missouri, districts must:

  1. be contiguous
  2. have equal populations
  3. be geographically compact

Interpretation of requirements

We enforce a maximum population deviation of 0.5%. We apply a basic county constraint to be in line with the splits in the plan, though there is no legal requirement. We add a VRA constraint targeting one BVAP opportunity district.

Data Sources

Data for Missouri comes from the ALARM Project's 2020 Redistricting Data Files.

Pre-processing Notes

No manual pre-processing decisions were necessary.

Simulation Notes

We sample 10,000 districting plans for Missouri across two independent runs of the SMC algorithm, and then thin the sample down to 5,000 plans. We use a standard algorithmic county constraint. No special techniques were needed to produce the sample.

validation plot

validation_20220620_1759

summary output

SMC: 5,000 sampled plans of 8 districts on 4,604 units
`adapt_k_thresh`=0.985 • `seq_alpha`=0.95
`est_label_mult`=1 • `pop_temper`=0

Plan diversity 80% range: 0.47 to 0.73

R-hat values for summary statistics:
   pop_overlap      total_vap       plan_dev      comp_edge    comp_polsby       pop_hisp      pop_white      pop_black 
      1.011086       1.003021       1.004126       1.018469       1.001638       1.005015       1.018586       1.013368 
      pop_aian      pop_asian       pop_nhpi      pop_other        pop_two       vap_hisp      vap_white      vap_black 
      1.002968       1.013584       1.007631       1.002935       1.008035       1.006255       1.010121       1.011582 
      vap_aian      vap_asian       vap_nhpi      vap_other        vap_two pre_16_rep_tru pre_16_dem_cli uss_16_rep_blu 
      1.009598       1.012810       1.007415       1.009529       1.020016       1.012339       1.007639       1.012887 
uss_16_dem_kan gov_16_rep_gre gov_16_dem_kos atg_16_rep_haw atg_16_dem_hen sos_16_rep_ash sos_16_dem_smi uss_18_rep_haw 
      1.014356       1.013625       1.011240       1.011824       1.012727       1.011692       1.014208       1.013098 
uss_18_dem_mcc pre_20_rep_tru pre_20_dem_bid gov_20_rep_par gov_20_dem_gal atg_20_rep_sch atg_20_dem_fin sos_20_rep_ash 
      1.009841       1.008009       1.011305       1.009734       1.012051       1.010966       1.009091       1.010426 
sos_20_dem_fal         arv_16         adv_16         arv_18         adv_18         arv_20         adv_20  county_splits 
      1.008248       1.012360       1.010968       1.013098       1.009841       1.009935       1.009855       1.001761 
   muni_splits            ndv            nrv        ndshare          e_dvs         pr_dem          e_dem          pbias 
      1.007331       1.008315       1.011801       1.011479       1.011471       1.000000       1.000457       1.006232 
          egap 
      1.000828 

Sampling diagnostics for SMC run 1 of 2 (5,000 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     2,981 (59.6%)     12.3%        0.48 3,197 (101%)     13 
Split 2     2,528 (50.6%)     18.1%        0.52 2,644 ( 84%)      8 
Split 3       867 (17.3%)     20.2%        0.56 2,668 ( 84%)      6 
Split 4     1,907 (38.1%)     25.0%        0.58 2,571 ( 81%)      4 
Split 5     2,237 (44.7%)     26.5%        0.54 2,674 ( 85%)      3 
Split 6     1,753 (35.1%)     20.2%        0.55 2,649 ( 84%)      3 
Split 7     2,902 (58.0%)      9.2%        0.57 2,306 ( 73%)      2 
Resample    2,697 (53.9%)       NA%        0.67 2,774 ( 88%)     NA 

Sampling diagnostics for SMC run 2 of 2 (5,000 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     2,975 (59.5%)     14.7%        0.48 3,151 (100%)     11 
Split 2     2,222 (44.4%)     20.0%        0.53 2,707 ( 86%)      7 
Split 3       914 (18.3%)     28.6%        0.56 2,649 ( 84%)      4 
Split 4     2,386 (47.7%)     28.8%        0.53 2,519 ( 80%)      3 
Split 5     2,548 (51.0%)     15.5%        0.53 2,761 ( 87%)      6 
Split 6     2,330 (46.6%)     17.1%        0.57 2,673 ( 85%)      4 
Split 7     2,625 (52.5%)      7.2%        0.56 2,392 ( 76%)      3 
Resample    2,410 (48.2%)       NA%        0.70 2,744 ( 87%)     NA 

•  Watch out for low effective samples, very low acceptance rates (less than 1%), large std. devs. of the log weights (more than 3
or so), and low numbers of unique plans. R-hat values for summary statistics should be between 1 and 1.05.
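To make the two headline diagnostics concrete, here is a small base R sketch of the textbook Gelman-Rubin R-hat for parallel chains and the usual importance-weight estimate of effective sample size. These are standard formulas, not necessarily the exact variants that `summary(plans)` reports, and the chains here are simulated rather than taken from the Missouri runs.

```r
# Classic Gelman-Rubin R-hat for a matrix with one column per chain.
# Textbook formula; not necessarily the exact variant redist reports.
rhat <- function(draws) {
  n <- nrow(draws)
  within  <- mean(apply(draws, 2, var))  # average within-chain variance
  between <- n * var(colMeans(draws))    # between-chain variance
  sqrt(((n - 1) / n * within + between / n) / within)
}

# Effective sample size from importance weights: (sum w)^2 / sum(w^2).
eff_samples <- function(w) sum(w)^2 / sum(w^2)

set.seed(1)
chains <- cbind(rnorm(5000), rnorm(5000))  # two simulated, well-mixed chains
r <- rhat(chains)
ess_equal <- eff_samples(rep(1, 5000))     # equal weights give the full sample size
```

When the chains agree, the between-chain term is small and R-hat sits close to 1; uneven weights pull the effective sample size below the nominal count.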

Additional notes

image

christopherkenny commented 2 years ago

Redistricting requirements

In Missouri, districts must:

  1. be contiguous
  2. have equal populations
  3. be geographically compact

Interpretation of requirements

We enforce a maximum population deviation of 0.5%. We apply a basic county constraint to be in line with the splits in the plan, though there is no legal requirement. We add a VRA constraint targeting one BVAP opportunity district.

Data Sources

Data for Missouri comes from the ALARM Project's 2020 Redistricting Data Files.

Pre-processing Notes

No manual pre-processing decisions were necessary.

Simulation Notes

We sample 10,000 districting plans for Missouri across two independent runs of the SMC algorithm, subset to plans with at least 30% BVAP in the most Black district, and then thin the sample down to 5,000 plans. The subsetting by BVAP removes around 2% of sampled plans. We use a standard algorithmic county constraint. No special techniques were needed to produce the sample.
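The subset-then-thin step can be sketched in base R on a toy district-level table. The numbers are made up, the column names mirror `vap_black` and `total_vap` from the summary output, and thinning by simple random subsampling is an assumption of this sketch rather than a statement of how the pipeline thins.

```r
# Toy district-level table: 4 plans ("draws") of 2 districts each (made-up numbers).
districts <- data.frame(
  draw      = rep(1:4, each = 2),
  vap_black = c(350, 100, 250, 200, 310, 140, 290, 160),
  total_vap = rep(1000, 8)
)

# BVAP share of the most Black district in each plan.
bvap_share <- districts$vap_black / districts$total_vap
max_bvap <- tapply(bvap_share, districts$draw, max)

# Keep plans whose most Black district is at least 30% BVAP.
kept_draws <- as.integer(names(max_bvap)[max_bvap >= 0.30])
share_removed <- 1 - length(kept_draws) / length(max_bvap)

# Thin the remaining plans to a target size by random subsampling (an assumption).
set.seed(1)
n_target <- 2
thinned <- kept_draws[sample.int(length(kept_draws), n_target)]
```

In this toy table, draws 2 and 4 top out at 25% and 29% BVAP and are dropped, so half of the plans are removed; in the run described above the corresponding share is around 2%.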

Validation

validation_20220621_2304

Summary

SMC: 5,000 sampled plans of 8 districts on 4,604 units
`adapt_k_thresh`=0.985 • `seq_alpha`=0.95
`est_label_mult`=1 • `pop_temper`=0

Plan diversity 80% range: 0.45 to 0.73

R-hat values for summary statistics:
     vap_black    pop_overlap      total_vap       plan_dev      comp_edge    comp_polsby       pop_hisp 
      1.003587       1.016695       1.012983       1.000903       1.017739       1.012518       1.022481 
     pop_white      pop_black       pop_aian      pop_asian       pop_nhpi      pop_other        pop_two 
      1.015122       1.005551       1.004617       1.006272       1.007635       1.002684       1.011791 
      vap_hisp      vap_white       vap_aian      vap_asian       vap_nhpi      vap_other        vap_two 
      1.009834       1.021428       1.008917       1.004242       1.008639       1.001351       1.016830 
pre_16_rep_tru pre_16_dem_cli uss_16_rep_blu uss_16_dem_kan gov_16_rep_gre gov_16_dem_kos atg_16_rep_haw 
      1.021180       1.004011       1.042427       1.003088       1.027639       1.002789       1.040899 
atg_16_dem_hen sos_16_rep_ash sos_16_dem_smi uss_18_rep_haw uss_18_dem_mcc pre_20_rep_tru pre_20_dem_bid 
      1.002163       1.048891       1.002186       1.039801       1.004271       1.020286       1.003561 
gov_20_rep_par gov_20_dem_gal atg_20_rep_sch atg_20_dem_fin sos_20_rep_ash sos_20_dem_fal         arv_16 
      1.032736       1.003141       1.048776       1.003026       1.048498       1.003306       1.037797 
        adv_16         arv_18         adv_18         arv_20         adv_20  county_splits    muni_splits 
      1.002967       1.039801       1.004271       1.040133       1.003197       1.004967       1.015247 
           ndv            nrv        ndshare          e_dvs          e_dem          pbias           egap 
      1.003244       1.036002       1.021201       1.021707       1.008580       1.038742       1.005395 

Sampling diagnostics for SMC run 1 of 2 (5,000 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     2,948 (59.0%)     12.4%        0.47 3,183 (101%)     13 
Split 2     1,959 (39.2%)     17.8%        0.53 2,680 ( 85%)      8 
Split 3     1,484 (29.7%)     11.3%        0.57 2,622 ( 83%)     11 
Split 4     1,927 (38.5%)     15.4%        0.57 2,527 ( 80%)      7 
Split 5     2,122 (42.4%)     11.8%        0.54 2,636 ( 83%)      8 
Split 6     2,520 (50.4%)     13.9%        0.56 2,602 ( 82%)      5 
Split 7     1,885 (37.7%)      7.2%        0.56 2,366 ( 75%)      3 
Resample    1,629 (32.6%)       NA%        0.72 2,656 ( 84%)     NA 

Sampling diagnostics for SMC run 2 of 2 (5,000 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     3,002 (60.0%)     14.7%        0.47 3,167 (100%)     11 
Split 2     2,117 (42.3%)     22.7%        0.52 2,703 ( 86%)      6 
Split 3     1,756 (35.1%)     17.8%        0.55 2,697 ( 85%)      7 
Split 4     2,101 (42.0%)     21.0%        0.55 2,630 ( 83%)      5 
Split 5     2,043 (40.9%)     22.0%        0.58 2,628 ( 83%)      4 
Split 6     2,330 (46.6%)     12.4%        0.57 2,639 ( 83%)      6 
Split 7     3,029 (60.6%)      6.0%        0.55 2,451 ( 78%)      4 
Resample    2,842 (56.8%)       NA%        0.67 2,816 ( 89%)     NA 

•  Watch out for low effective samples, very low acceptance rates (less than 1%), large std. devs. of the log
weights (more than 3 or so), and low numbers of unique plans. R-hat values for summary statistics should be
between 1 and 1.05.

Other Notes

image

> plans %>% subset_sampled() %>% group_by(draw, chain) %>%
+     summarize(mmds = sum(vap_black/total_vap > 0.3 & ndshare > 0.5), .groups = 'drop') %>%
+     count(mmds)
  mmds    n
1    1 5000

@CoryMcCartan 10th time is the charm