alarm-redist / fifty-states

Redistricting analysis for all 50 U.S. states
https://alarm-redist.github.io/fifty-states/

2020 Georgia Congressional Districts #94

Closed kevpwang closed 2 years ago

kevpwang commented 2 years ago

Redistricting requirements

Under the 2021-22 Guidelines for the House Legislative and Congressional Reapportionment Committee, districts in Georgia must:

  1. be contiguous
  2. have equal populations
  3. be geographically compact
  4. preserve county and municipality boundaries as much as possible
  5. avoid the unnecessary pairing of incumbents

Interpretation of requirements

We enforce a maximum population deviation of 0.5%.
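As a rough illustration of what a 0.5% tolerance means, the sketch below computes the allowed population range around the ideal district size. The statewide total used here is a round placeholder, not Georgia's census count, and the function names are hypothetical; only the 14 districts and the 0.5% tolerance come from this write-up.

```python
def population_bounds(total_pop, n_districts, tolerance):
    """Return (lower, target, upper) district populations for a given tolerance."""
    target = total_pop / n_districts
    return target * (1 - tolerance), target, target * (1 + tolerance)


def max_deviation(district_pops, target):
    """Largest relative deviation from the target across all districts."""
    return max(abs(p - target) / target for p in district_pops)


# Placeholder statewide total (an assumption for illustration only).
PLACEHOLDER_TOTAL_POP = 10_500_000

lower, target, upper = population_bounds(PLACEHOLDER_TOTAL_POP, 14, 0.005)
print(f"target={target:,.0f}, allowed range=[{lower:,.0f}, {upper:,.0f}]")
```

A plan satisfies the tolerance when `max_deviation(pops, target) <= 0.005` for its district populations `pops`.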

Data Sources

Data for Georgia comes from the ALARM Project's 2020 Redistricting Data Files.

Pre-processing Notes

No manual pre-processing decisions were necessary.

Simulation Notes

We sample 20,000 districting plans for Georgia. To balance county and municipality splits, we create pseudocounties for use in the county constraint, which leads to fewer municipality splits than using actual counties in that constraint. Cobb, Fulton, and Gwinnett Counties must be split because of their large populations, but within each of these counties we avoid splitting any municipality. We apply a hinge Gibbs constraint of strength 20 to encourage drawing the same number of majority-Black districts as the enacted plan, focusing on districts with relatively higher proportions of Black voters. We also apply a hinge Gibbs constraint of strength 10 to discourage packing of Black voters.
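To make the two hinge constraints more concrete, here is a minimal sketch of how a hinge-style penalty behaves. It is not the sampler's actual implementation and may not match its exact functional form: the linear hinge, the target shares (0.5 and 0.7), the `top_n` argument, and the example district shares are all assumptions chosen for illustration. Only the strengths (20 and 10) come from the description above.

```python
def hinge_penalty(shares, target, strength, top_n=None):
    """Penalize districts whose group share falls below `target`.

    If `top_n` is given, only the `top_n` districts with the highest
    shares are considered (an assumed way to focus on those districts).
    """
    considered = sorted(shares, reverse=True)
    if top_n is not None:
        considered = considered[:top_n]
    return strength * sum(max(0.0, target - s) for s in considered)


def inverse_hinge_penalty(shares, target, strength):
    """Penalize districts whose group share rises above `target` (packing)."""
    return strength * sum(max(0.0, s - target) for s in shares)


# Hypothetical Black VAP shares for four districts.
example_shares = [0.55, 0.45, 0.30, 0.75]

# Encourage majority-Black districts among the three highest-share districts.
encourage = hinge_penalty(example_shares, target=0.5, strength=20, top_n=3)

# Discourage packing above an assumed 70% share.
discourage = inverse_hinge_penalty(example_shares, target=0.7, strength=10)

print(encourage, discourage)
```

In a Gibbs-style target, a plan's weight would be scaled by something like `exp(-penalty)`, so plans with larger penalties are sampled less often. In this example the 45% district just misses the majority threshold and the 75% district exceeds the packing threshold, so both penalties are nonzero.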

Validation

[Image: validation_20220617_2318]

SMC: 20,000 sampled plans of 14 districts on 2,698 units
`adapt_k_thresh`=0.985 • `seq_alpha`=0.5      
`est_label_mult`=1 • `pop_temper`=0.01        
ℹ Computing summary statistics for GA_cd_2020
Plan diversity 80% range: 0.62 to 0.83        
ℹ Computing summary statistics for GA_cd_2020
R-hat values for summary statistics:
    pop_overlap       total_vap        plan_dev       comp_edge     comp_polsby        pop_hisp 
       1.000799        1.003039        1.004908        1.007065        1.022116        1.009669 
      pop_white       pop_black        pop_aian       pop_asian        pop_nhpi       pop_other 
       1.006999        1.014408        1.010030        1.014722        1.001162        1.004708 
        pop_two        vap_hisp       vap_white       vap_black        vap_aian       vap_asian 
       1.009260        1.012871        1.007489        1.014679        1.012604        1.013841 
       vap_nhpi       vap_other         vap_two  pre_16_rep_tru  pre_16_dem_cli  uss_16_rep_isa 
       1.001096        1.010891        1.008015        1.005688        1.016390        1.002402 
 uss_16_dem_bar  gov_18_rep_kem  gov_18_dem_abr  atg_18_rep_car  atg_18_dem_bai  sos_18_rep_raf 
       1.006777        1.001745        1.012205        1.002346        1.004412        1.001796 
 sos_18_dem_bar sos_r18_rep_raf sos_r18_dem_bar  pre_20_rep_tru  pre_20_dem_bid  uss_20_rep_per 
       1.007770        1.011450        1.000761        1.001534        1.015859        1.002111 
 uss_20_dem_oss          arv_16          adv_16          arv_18          adv_18          arv_20 
       1.011646        1.001944        1.004733        1.001989        1.007848        1.001819 
         adv_20   county_splits     muni_splits             ndv             nrv         ndshare 
       1.017033        1.013087        1.001263        1.010022        1.001786        1.002082 
          e_dvs          pr_dem           e_dem           pbias            egap 
       1.002116        1.002921        1.000007        1.000692        1.000568 

Sampling diagnostics for SMC run 1 of 2       
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     8,670 (86.7%)     13.9%        0.25 6,315 (100%)     13 
Split 2     8,453 (84.5%)     20.1%        0.47 5,980 ( 95%)      8 
Split 3     8,413 (84.1%)     28.5%        0.54 5,930 ( 94%)      5 
Split 4     8,307 (83.1%)     18.8%        0.57 5,882 ( 93%)      7 
Split 5     8,213 (82.1%)     19.6%        0.60 5,845 ( 92%)      6 
Split 6     8,069 (80.7%)     17.8%        0.63 5,855 ( 93%)      6 
Split 7     7,770 (77.7%)     23.1%        0.68 5,805 ( 92%)      4 
Split 8     7,571 (75.7%)     26.7%        0.71 5,764 ( 91%)      3 
Split 9     7,585 (75.8%)     23.0%        0.73 5,615 ( 89%)      3 
Split 10    7,492 (74.9%)     20.2%        0.71 5,654 ( 89%)      3 
Split 11    7,592 (75.9%)     23.1%        0.72 5,548 ( 88%)      2 
Split 12    7,647 (76.5%)     17.2%        0.68 5,405 ( 86%)      2 
Split 13    7,406 (74.1%)      6.2%        0.71 4,832 ( 76%)      2 
Resample    3,197 (32.0%)       NA%        1.19 4,661 ( 74%)     NA 

Sampling diagnostics for SMC run 2 of 2       
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     8,668 (86.7%)     19.8%        0.25 6,336 (100%)      9 
Split 2     8,433 (84.3%)     26.5%        0.47 5,986 ( 95%)      6 
Split 3     8,417 (84.2%)     34.9%        0.53 5,969 ( 94%)      4 
Split 4     8,291 (82.9%)     26.0%        0.57 5,885 ( 93%)      5 
Split 5     8,200 (82.0%)     28.4%        0.60 5,805 ( 92%)      4 
Split 6     7,931 (79.3%)     32.5%        0.66 5,846 ( 92%)      3 
Split 7     7,876 (78.8%)     29.2%        0.69 5,793 ( 92%)      3 
Split 8     7,649 (76.5%)     34.9%        0.71 5,745 ( 91%)      2 
Split 9     7,440 (74.4%)     30.6%        0.73 5,693 ( 90%)      2 
Split 10    7,636 (76.4%)     26.8%        0.73 5,632 ( 89%)      2 
Split 11    7,638 (76.4%)     23.1%        0.71 5,602 ( 89%)      2 
Split 12    7,529 (75.3%)     12.5%        0.70 5,364 ( 85%)      3 
Split 13    7,474 (74.7%)      6.2%        0.69 4,772 ( 75%)      2 
Resample    2,879 (28.8%)       NA%        1.11 4,737 ( 75%)     NA 

•  Watch out for low effective samples, very low acceptance rates (less than 1%), large std. devs. of the log
weights (more than 3 or so), and low numbers of unique plans. R-hat values for summary statistics should be
between 1 and 1.05.
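For readers unfamiliar with these diagnostics, the sketch below shows one common way to compute an effective sample size from importance weights and a classic Gelman-Rubin R-hat across independent runs. This is an illustration under stated assumptions, not the code that produced the output above: the sampler may use a different R-hat variant (for example a rank-normalized or split version), and the function names and toy inputs here are hypothetical.

```python
import math
import statistics


def effective_sample_size(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def gelman_rubin_rhat(chains):
    """Classic (non-split) R-hat for equal-length chains of one statistic."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average within-chain variance.
    within = statistics.fmean(statistics.variance(c) for c in chains)
    # B: between-chain variance, scaled by the chain length.
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


# Toy inputs (made up for illustration).
equal_weights = [1.0, 1.0, 1.0, 1.0]
degenerate_weights = [1.0, 0.0, 0.0, 0.0]
disagreeing_runs = [[0.0, 1.0, 2.0, 3.0], [10.0, 11.0, 12.0, 13.0]]

print(effective_sample_size(equal_weights))
print(effective_sample_size(degenerate_weights))
print(gelman_rubin_rhat(disagreeing_runs))
```

Equal weights give an effective sample size equal to the number of samples, while a single dominant weight collapses it toward 1. Runs that disagree on a statistic produce an R-hat well above the 1.05 guideline.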

Checklist

@CoryMcCartan @kuriwaki

kevpwang commented 2 years ago

Performance plot:

[Image: performance]

kevpwang commented 2 years ago

First time using the negative and inverse hinge constraints, so let me know whether I've described what they're doing correctly.

CoryMcCartan commented 2 years ago

This looks fantastic! Just need to remove those 2 commented-out constraints, and add the code (in the channel) to filter the 20,000 down to 5,000 total.

kevpwang commented 2 years ago

Thanks! Just made fixes.

CoryMcCartan commented 2 years ago

@kevpwang this is good to merge, right?

kevpwang commented 2 years ago

This is all set.

geoffw72 commented 2 months ago

@CoryMcCartan, thanks for pointing me to these 2020 Congressional state files. Unlike for other states, the graph of minority VAP share by ordered district posted here for Georgia doesn't seem to match your GA_cd_2020_stats.tab results. The GA_cd_2020_stats.tab stats for the enacted districts seem right (similar to https://projects.fivethirtyeight.com/redistricting-2022-maps/georgia/). Can you take another look at the minority VAP graph above? Maybe it's from a different enacted or proposed map? -Geoff