alarm-redist / fifty-states

Redistricting analysis for all 50 U.S. states
https://alarm-redist.github.io/fifty-states/

Re-run 2020 North Carolina Congressional Districts #125

Closed mzwu closed 2 years ago

mzwu commented 2 years ago

Redistricting requirements

In North Carolina, under North Carolina State Constitution Article II Sections 3 & 5, districts must:

  1. be contiguous
  2. have equal populations
  3. be geographically compact
  4. preserve county boundaries as much as possible

Interpretation of requirements

We enforce a maximum population deviation of 0.5%. We add a hinge Gibbs constraint targeting the same number of majority-minority districts as the enacted plan. We also apply a hinge Gibbs constraint to discourage packing of minority voters.
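As a rough illustration of the 0.5% population deviation bound, the check can be sketched in a few lines of Python. This is not the project's code (the analysis itself is run in R). The function names and toy populations below are hypothetical, and the sketch assumes deviation is measured relative to the ideal (mean) district population.

```python
def max_population_deviation(district_pops):
    """Largest relative deviation of any district from the ideal (mean) population."""
    target = sum(district_pops) / len(district_pops)
    return max(abs(p - target) / target for p in district_pops)


def within_tolerance(district_pops, tol=0.005):
    """True if every district is within `tol` (0.5% by default) of the ideal population."""
    return max_population_deviation(district_pops) <= tol


# Hypothetical toy plans with 4 districts each (not North Carolina data).
ok_plan = [1000, 1004, 997, 999]    # worst district is 0.4% off the ideal of 1000
bad_plan = [1000, 1010, 990, 1000]  # worst district is 1.0% off the ideal of 1000

print(within_tolerance(ok_plan), within_tolerance(bad_plan))
```

A plan passes only if its single worst district is inside the tolerance, which is why the check uses the maximum rather than an average.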

Data Sources

Data for North Carolina comes from the ALARM Project's 2020 Redistricting Data Files. Data for the 2022 North Carolina ratified congressional map comes from the North Carolina General Assembly.

Pre-processing Notes

No manual pre-processing decisions were necessary.

Simulation Notes

We sample 20,000 districting plans for North Carolina across two independent runs of the SMC algorithm, and then thin the sample down to 5,000 plans.
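The thread does not say how the thinning is done. One simple possibility, shown here only as an assumption, is to keep evenly spaced plans from each run so that both runs contribute equally. The helper `thin_evenly` and the plan indices are hypothetical.

```python
def thin_evenly(plan_ids, keep):
    """Keep `keep` plans at evenly spaced positions in `plan_ids`."""
    n = len(plan_ids)
    if keep > n:
        raise ValueError("cannot keep more plans than were sampled")
    return [plan_ids[(i * n) // keep] for i in range(keep)]


# Two runs of 10,000 plans each (20,000 total), thinned to 2,500 per run.
run_size = 10_000
runs = {1: list(range(run_size)), 2: list(range(run_size))}
thinned = {run: thin_evenly(ids, 2_500) for run, ids in runs.items()}

total_kept = sum(len(ids) for ids in thinned.values())
print(total_kept)  # 5000
```

With these sizes the rule reduces to keeping every fourth plan in each run.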

Validation

[Image: validation plots]

SMC: 20,000 sampled plans of 14 districts on 2,666 units
`adapt_k_thresh`=0.985 • `seq_alpha`=0.5
`est_label_mult`=1 • `pop_temper`=0.01

Plan diversity 80% range: 0.69 to 0.87

R-hat values for summary statistics:
   pop_overlap      total_vap       plan_dev      comp_edge    comp_polsby       pop_hisp      pop_white      pop_black       pop_aian 
      1.011740       1.004010       1.016022       1.006402       1.007232       1.011292       1.001015       1.008898       1.000115 
     pop_asian       pop_nhpi      pop_other        pop_two       vap_hisp      vap_white      vap_black       vap_aian      vap_asian 
      1.027659       1.001441       1.040853       1.000344       1.011211       1.001468       1.005998       1.000307       1.030569 
      vap_nhpi      vap_other        vap_two pre_16_rep_tru pre_16_dem_cli uss_16_rep_bur uss_16_dem_ros gov_16_rep_mcc gov_16_dem_coo 
      1.006742       1.039742       1.003739       1.014649       1.041666       1.011259       1.037633       1.011370       1.036032 
atg_16_rep_new atg_16_dem_ste sos_16_rep_lap sos_16_dem_mar pre_20_rep_tru pre_20_dem_bid uss_20_rep_til uss_20_dem_cun gov_20_rep_for 
      1.012634       1.036626       1.010377       1.033683       1.012790       1.048726       1.009076       1.047310       1.013620 
gov_20_dem_coo atg_20_rep_one atg_20_dem_ste sos_20_rep_syk sos_20_dem_mar         arv_16         adv_16         arv_20         adv_20 
      1.046216       1.009898       1.044914       1.009182       1.044455       1.010635       1.036324       1.010868       1.046275 
 county_splits    muni_splits            ndv            nrv        ndshare          e_dvs         pr_dem          e_dem          pbias 
      1.004991       1.014032       1.043048       1.012762       1.028974       1.028886       1.044242       1.006234       1.006050 
          egap 
      1.004837 
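For readers unfamiliar with R-hat, the sketch below implements the classic Gelman-Rubin formula for a summary statistic measured on several independent runs. It is an illustration only: the summary output above comes from the R tooling, which may use a different variant (for example a split or rank-normalized version), and the toy chains are made up. Values close to 1 suggest the runs agree; a commonly used rule of thumb is to look for values below 1.05, which every statistic listed above satisfies.

```python
import math
import statistics


def gelman_rubin_rhat(chains):
    """Classic Gelman-Rubin R-hat for equal-length chains of a scalar statistic."""
    n = len(chains[0])
    if any(len(c) != n for c in chains):
        raise ValueError("all chains must have the same length")
    # W: average within-chain variance.
    within = statistics.mean(statistics.variance(c) for c in chains)
    # B: between-chain variance, scaled by the chain length.
    between = n * statistics.variance([statistics.mean(c) for c in chains])
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


# Toy chains (hypothetical values, not from the simulation).
agreeing = [[1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 4.0, 3.0]]
disagreeing = [[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]]

print(gelman_rubin_rhat(agreeing) < 1.05)     # runs agree
print(gelman_rubin_rhat(disagreeing) > 1.05)  # runs disagree
```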

Sampling diagnostics for SMC run 1 of 2
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     8,041 (80.4%)     11.5%        0.43 6,354 (101%)     12 
Split 2     7,829 (78.3%)     17.7%        0.54 5,974 ( 95%)      7 
Split 3     7,698 (77.0%)     22.3%        0.61 5,852 ( 93%)      5 
Split 4     7,560 (75.6%)     31.0%        0.65 5,867 ( 93%)      3 
Split 5     7,423 (74.2%)     18.4%        0.68 5,671 ( 90%)      5 
Split 6     7,445 (74.5%)     20.7%        0.70 5,740 ( 91%)      4 
Split 7     7,552 (75.5%)     19.2%        0.72 5,709 ( 90%)      4 
Split 8     7,442 (74.4%)     23.1%        0.71 5,675 ( 90%)      3 
Split 9     7,390 (73.9%)     21.1%        0.73 5,696 ( 90%)      3 
Split 10    7,515 (75.2%)     14.3%        0.73 5,664 ( 90%)      4 
Split 11    7,669 (76.7%)     15.5%        0.72 5,482 ( 87%)      3 
Split 12    7,820 (78.2%)     11.8%        0.70 5,334 ( 84%)      3 
Split 13    7,423 (74.2%)      4.0%        0.77 4,850 ( 77%)      3 
Resample    2,891 (28.9%)       NA%        1.21 4,691 ( 74%)     NA 

Sampling diagnostics for SMC run 2 of 2
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     8,087 (80.9%)     15.5%        0.43 6,267 ( 99%)      9 
Split 2     7,872 (78.7%)     20.2%        0.54 5,938 ( 94%)      6 
Split 3     7,650 (76.5%)     21.9%        0.60 5,908 ( 93%)      5 
Split 4     7,565 (75.7%)     24.5%        0.65 5,812 ( 92%)      4 
Split 5     7,380 (73.8%)     18.5%        0.68 5,786 ( 92%)      5 
Split 6     7,431 (74.3%)     27.1%        0.71 5,693 ( 90%)      3 
Split 7     7,455 (74.6%)     19.4%        0.73 5,683 ( 90%)      4 
Split 8     7,509 (75.1%)     23.0%        0.73 5,720 ( 90%)      3 
Split 9     7,523 (75.2%)     12.8%        0.71 5,734 ( 91%)      5 
Split 10    7,509 (75.1%)     13.9%        0.70 5,664 ( 90%)      4 
Split 11    7,403 (74.0%)      9.4%        0.74 5,542 ( 88%)      5 
Split 12    7,589 (75.9%)     11.9%        0.73 5,351 ( 85%)      3 
Split 13    7,744 (77.4%)      4.1%        0.72 4,926 ( 78%)      3 
Resample    3,593 (35.9%)       NA%        1.15 4,889 ( 77%)     NA
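The "Eff. samples" column reports an effective sample size derived from the importance weights at each step. A standard way to compute such a quantity is the Kish formula, (sum of weights)^2 / (sum of squared weights). The sketch below uses that formula on log weights. Whether the diagnostics above use exactly this estimator is an assumption, and the inputs are toy values.

```python
import math


def effective_sample_size(log_weights):
    """Kish effective sample size computed from unnormalized log weights."""
    shift = max(log_weights)  # subtract the max for numerical stability
    weights = [math.exp(lw - shift) for lw in log_weights]
    return sum(weights) ** 2 / sum(w * w for w in weights)


# Equal weights: every sample counts fully.
print(effective_sample_size([0.0, 0.0, 0.0, 0.0]))
# One dominant weight: the sample behaves like roughly a single draw.
print(effective_sample_size([0.0, -100.0, -100.0, -100.0]))
```

Under this definition, a larger spread in the log weights (the "Log wgt. sd" column) pushes the effective sample size down, which matches the drop seen at the final resampling step.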

Checklist

Additional Notes

Performance plot for BVAP: [image]

@kuriwaki @christopherkenny

mzwu commented 2 years ago

Percentage of Democratic seats by BVAP rank:

   bvap_rank    dem
       <dbl>  <dbl>
 1         1 0     
 2         2 0.0032
 3         3 0.0982
 4         4 0.152 
 5         5 0.231 
 6         6 0.308 
 7         7 0.349 
 8         8 0.409 
 9         9 0.688 
10        10 0.746 
11        11 0.749 
12        12 0.692 
13        13 0.910 
14        14 0.986
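A table like the one above can be produced by ranking the districts of each plan by BVAP share and recording how often the district at each rank leans Democratic. The sketch below is a guess at that computation, not the code that was run. It assumes rank 1 is the lowest-BVAP district, that "Democratic" means a two-party Democratic share above 0.5, and it uses made-up toy plans of three districts.

```python
def dem_rate_by_bvap_rank(plans):
    """For each BVAP rank (ascending), the share of plans whose district at that rank is Democratic.

    Each plan is a list of (bvap_share, dem_share) tuples, one per district.
    """
    n_districts = len(plans[0])
    dem_counts = [0] * n_districts
    for plan in plans:
        ranked = sorted(plan, key=lambda district: district[0])  # lowest BVAP first
        for rank, (_, dem_share) in enumerate(ranked):
            if dem_share > 0.5:
                dem_counts[rank] += 1
    return [count / len(plans) for count in dem_counts]


# Two hypothetical plans with three districts each.
toy_plans = [
    [(0.10, 0.40), (0.35, 0.60), (0.20, 0.45)],
    [(0.30, 0.55), (0.12, 0.42), (0.22, 0.52)],
]
rates = dem_rate_by_bvap_rank(toy_plans)
print(rates)  # [0.0, 0.5, 1.0]
```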

Total Black performant districts:

  n_black_perf    n
1            0  104
2            1  973
3            2 3438
4            3  482
5            4    3

So, 2.08% of the plans have no Democratic 30%+ BVAP districts.
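The 2.08% figure follows directly from the tally above, as this small check shows. The counts are copied from the table; the variable names are illustrative.

```python
# Number of plans by count of Black performant districts (from the table above).
counts = {0: 104, 1: 973, 2: 3438, 3: 482, 4: 3}

total_plans = sum(counts.values())
share_zero = counts[0] / total_plans

print(total_plans)           # 5000
print(f"{share_zero:.2%}")   # 2.08%
```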

kuriwaki commented 2 years ago

I was looking at your previous version of 2020 NC (#46) and the BVAP boxplots seem fairly similar, except there's more variation.

Some notes:

  1. 2% of plans with 0 performant districts is not ideal, but it might be OK to keep this mostly as-is after subsetting out those non-performant plans (as we did in SC).
  2. In the summary output, I've seen effective sample sizes that are 300%+. Is that OK?
  3. I saw your summary output was the R-hats of the 5,000 thinned plans, but for checking convergence I think it's fine to summarize all the plans before thinning to 5,000. If we do that, can we make do with fewer nsims?

Otherwise I'll pass on to @christopherkenny

mzwu commented 2 years ago

In that case, I was able to increase the strengths of the hinge constraints while the sims still reached convergence. Now, when the plans are thinned, none of them has 0 performant districts. I just updated the original PR comment with the new validation plots and performance plot.

Here are the extra stats - Percentage of Democratic seats by BVAP rank:

   bvap_rank    dem
       <dbl>  <dbl>
 1         1 0     
 2         2 0.0026
 3         3 0.0794
 4         4 0.198 
 5         5 0.238 
 6         6 0.264 
 7         7 0.250 
 8         8 0.385 
 9         9 0.602 
10        10 0.788 
11        11 0.822 
12        12 0.801 
13        13 0.990 
14        14 1

Total Black performant districts:

  n_black_perf    n
1            1  172
2            2 3879
3            3  939
4            4   10

Also, the effective sample sizes are not crazy high anymore. @kuriwaki @christopherkenny

kuriwaki commented 2 years ago

Great that there's no need for subsetting. Looks good to me.

christopherkenny commented 2 years ago

Sorry for the delay. Yeah, this looks good now. Thanks @mzwu