2020 Kansas Congressional Districts (Fixes cores)

christopherkenny commented 3 months ago

Redistricting requirements

In Kansas, according to the Proposed Guidelines and Criteria for 2022 Kansas Congressional Redistricting, districts must:

be contiguous
have equal populations
be geographically compact
preserve county and municipality boundaries as much as possible
preserve the cores of existing districts
preserve communities of social, cultural, racial, ethnic, and economic interest to the extent possible

Algorithmic Constraints

We enforce a maximum population deviation of 0.5%. We add a county constraint.
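
The 0.5% tolerance can be read as a cap on each district's relative deviation from the ideal (equal-share) population. Below is a minimal base R sketch of that check, assuming deviation is measured as |population / target - 1|. The district totals are made up for illustration and are not output from this PR.

```r
# Hypothetical populations for a 4-district plan; not real Kansas data.
district_pop <- c(734800, 733900, 735600, 733500)

# Ideal population: the total divided evenly across districts.
target <- sum(district_pop) / length(district_pop)

# Relative deviation of each district from the ideal.
deviation <- abs(district_pop / target - 1)

# A plan passes when its worst district is within the tolerance.
pop_tol <- 0.005
max_dev <- max(deviation)
within_tol <- max_dev <= pop_tol

print(round(100 * deviation, 3))  # per-district deviation, in percent
print(within_tol)
```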

Data Sources

Data for Kansas comes from the ALARM Project's 2020 Redistricting Data Files. Data for the 2022 Kansas enacted congressional map comes from the American Redistricting Project.

Pre-processing Notes

To preserve the cores of prior districts, we merge all precincts that are more than two precincts away from a district border under the 2010 plan. Precincts in counties that are split by existing district boundaries are merged only within their county.
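
The core-merging idea can be illustrated on a toy adjacency graph. The sketch below is base R only and is not the code used in this PR. It assumes that precincts touching another district count as step 1, and that anything beyond step 2 belongs to the core. The 12-precinct chain, `old_plan`, and `boundary` are hypothetical stand-ins, and the within-county restriction for split counties is not modeled.

```r
# Toy chain of 12 precincts: precinct i touches i - 1 and i + 1 (hypothetical data).
n <- 12
adj <- lapply(seq_len(n), function(i) setdiff(c(i - 1, i + 1), c(0, n + 1)))
old_plan <- rep(1:2, each = 6)  # stand-in for the 2010 district assignment

# Border precincts have at least one neighbor in a different old district.
on_border <- vapply(seq_len(n), function(i) {
    any(old_plan[adj[[i]]] != old_plan[i])
}, logical(1))

# Breadth-first search outward from the border; border precincts are step 1.
steps <- ifelse(on_border, 1L, NA_integer_)
frontier <- which(on_border)
level <- 1L
while (length(frontier) > 0) {
    level <- level + 1L
    nbrs <- unique(unlist(adj[frontier]))
    frontier <- nbrs[is.na(steps[nbrs])]  # neighbors not yet reached
    steps[frontier] <- level
}

# Precincts more than `boundary` steps from the border form the core.
boundary <- 2L
is_core <- steps > boundary

# Core precincts share one id per old district; the rest stay separate units.
core_id <- ifelse(is_core,
                  paste0("core_", old_plan),
                  paste0("precinct_", seq_len(n)))
print(data.frame(precinct = seq_len(n), old_plan, steps, core_id))
```

In this toy case the 12 precincts collapse to 6 units: one core per old district plus the four precincts nearest the border, which remain free to move between districts.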

Simulation Notes

We sample 5,000 districting plans for Kansas across two independent runs of the SMC algorithm. No special techniques were needed to produce the sample.

Validation

validation_20240515_2143

SMC: 5,000 sampled plans of 4 districts on 4,240 units
`adapt_k_thresh`=0.99 • `seq_alpha`=0.7
`pop_temper`=0

Plan diversity 80% range: 0.15 to 0.66
✖ WARNING: Low plan diversity

R-hat values for summary statistics:
   pop_overlap      total_vap       plan_dev      comp_edge    comp_polsby      pop_white 
         1.018          1.004          1.001          1.007          1.023          1.008 
     pop_asian        pop_two      pop_other       pop_nhpi       pop_aian       pop_hisp 
         1.030          1.023          1.025          1.000          1.015          1.012 
     pop_black        vap_two      vap_other       vap_nhpi       vap_hisp      vap_black 
         1.007          1.019          1.016          1.001          1.011          1.008 
     vap_asian       vap_aian      vap_white pre_16_dem_cli uss_16_dem_wie uss_20_dem_bol 
         1.023          1.016          1.009          1.021          1.016          1.023 
gov_18_rep_kob pre_16_rep_tru uss_16_rep_mor gov_18_dem_kel atg_18_rep_sch atg_18_dem_swa 
         1.016          1.019          1.015          1.022          1.015          1.023 
sos_18_rep_sch sos_18_dem_mcc pre_20_dem_bid pre_20_rep_tru uss_20_rep_mar         arv_16 
         1.017          1.024          1.024          1.015          1.015          1.016 
        adv_20         adv_16         arv_18         adv_18         arv_20  county_splits 
         1.024          1.010          1.016          1.024          1.016          1.000 
   muni_splits            ndv            nrv        ndshare          e_dvs          e_dem 
         1.002          1.023          1.015          1.031          1.030          1.013 
         pbias           egap 
         1.004          1.004 

Sampling diagnostics for SMC run 1 of 2 (2,500 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     2,426 (97.1%)      3.8%        0.26 1,540 ( 97%)      9 
Split 2     2,260 (90.4%)      6.2%        0.43 1,483 ( 94%)      5 
Split 3     2,297 (91.9%)      0.6%        0.42   749 ( 47%)      3 
Resample    2,115 (84.6%)       NA%        0.42 2,078 (131%)     NA 

Sampling diagnostics for SMC run 2 of 2 (2,500 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     2,431 (97.2%)      2.9%        0.25 1,587 (100%)     12 
Split 2     1,957 (78.3%)      4.6%        0.47 1,499 ( 95%)      7 
Split 3     2,224 (89.0%)      0.6%        0.48   702 ( 44%)      4 
Resample    1,969 (78.8%)       NA%        0.48 2,008 (127%)     NA 

•  Watch out for low effective samples, very low acceptance rates (less than 1%), large std.
devs. of the log weights (more than 3 or so), and low numbers of unique plans. R-hat values for
summary statistics should be between 1 and 1.05.
• Low diversity: Check for potential bottlenecks. Increase the number of samples. Examine the
diversity plot with `hist(plans_diversity(plans), breaks=24)`. Consider weakening or removing
constraints, or increasing the population tolerance. If the acceptance rate drops quickly in the
final splits, try increasing `pop_temper` by 0.01.
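
For readers unfamiliar with the "Eff. samples" and "Log wgt. sd" columns, the sketch below shows how the two quantities relate. It assumes the common Kish definition of effective sample size, (sum of weights)^2 / (sum of squared weights), which may differ in detail from what the diagnostics above compute. The log weights are simulated and `kish_ess` is a hypothetical helper name.

```r
# Kish effective sample size from log importance weights.
kish_ess <- function(log_w) {
    # Rescale before exponentiating for numerical stability;
    # the ratio below does not depend on the scale of the weights.
    w <- exp(log_w - max(log_w))
    sum(w)^2 / sum(w^2)
}

set.seed(1)
n <- 2500L
log_w <- rnorm(n, mean = 0, sd = 0.4)  # simulated log weights, not from this run

ess <- kish_ess(log_w)
cat(sprintf("log weight sd: %.2f, ESS: %.0f of %d (%.1f%%)\n",
            sd(log_w), ess, n, 100 * ess / n))
```

Equal weights give an effective sample size equal to the number of samples, and the value shrinks as the spread of the log weights grows, which is why a large log weight standard deviation is flagged as a warning sign.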

Checklist

[x] I have followed the instructions
[x] I have updated the tracker
[x] All TODO lines from the template code have been removed
[x] I have merged in the main branch and then recalculated summary statistics
[x] I have run enforce_style() to format my code
[x] The documentation copied above is up-to-date
[x] There are no data files in this pull request
[x] None of the file output paths (for the redist_map and redist_plans objects, and summary statistics) have been edited

@CoryMcCartan

Additional notes

The low plan diversity warning is due to the use of cores.

CoryMcCartan commented 3 months ago

For completeness can you attach a plot(map, core_id) and also a histogram of pop overlap with the old districts across the ensemble?

christopherkenny commented 3 months ago

Output of plot(map, core_id)

Pop overlap with old plan:

plans |> 
    match_numbers(map$cd_2010) |> 
    hist(pop_overlap)

CoryMcCartan commented 3 months ago

Great.
