tylersimko closed 2 years ago
Summary for 5,000 final sampled:
SMC: 5,000 sampled plans of 38 districts on 9,007 units
`adapt_k_thresh`=0.985 • `seq_alpha`=0.95
`est_label_mult`=1 • `pop_temper`=0.03
Plan diversity 80% range: 0.79 to 0.90
R-hat values for summary statistics:
pop_overlap     total_vap       total_cvap      plan_dev
1.013201        1.028427        1.018327        1.007102
comp_edge       comp_polsby     pop_hisp        pop_white
1.001882        1.009034        1.005172        1.016104
pop_black       pop_aian        pop_asian       pop_nhpi
1.025503        1.002803        1.004963        1.001971
pop_other       pop_two         vap_hisp        vap_white
1.001015        1.004194        1.007224        1.016347
vap_black       vap_aian        vap_asian       vap_nhpi
1.021574        1.001649        1.006905        1.018076
vap_other       vap_two         cvap_white      cvap_black
1.003823        1.003785        1.001303        1.021426
cvap_hisp       cvap_asian      cvap_aian       cvap_nhpi
1.009156        1.023701        1.001795        1.005509
cvap_two        cvap_other      pre_16_rep_tru  pre_16_dem_cli
1.000110        1.014984        1.009103        1.010923
uss_18_rep_cru  uss_18_dem_oro  gov_18_rep_abb  gov_18_dem_val
1.014252        1.051739        1.017069        1.037330
atg_18_rep_pax  atg_18_dem_nel  pre_20_rep_tru  pre_20_dem_bid
1.015387        1.045510        1.018862        1.028955
uss_20_rep_cor  uss_20_dem_heg  arv_16          adv_16
1.022019        1.028307        1.009103        1.010923
arv_18          adv_18          arv_20          adv_20
1.015017        1.053836        1.019445        1.029503
county_splits   muni_splits     ndv             nrv
1.024588        1.013475        1.039720        1.016543
ndshare         e_dvs           e_dem           pbias
1.009798        1.010090        1.021942        1.026753
egap
1.024320
✖ WARNING: SMC runs have not converged.
Sampling diagnostics for SMC run 1 of 2 (25,000 samples)
Eff. samples (%) Acc. rate Log wgt. sd Max. unique Est. k
Split 1 13,837 (55.3%) 23.3% 0.55 15,809 (100%) 7
Split 2 12,745 (51.0%) 28.2% 0.64 13,598 ( 86%) 5
Split 3 11,201 (44.8%) 18.9% 0.65 13,138 ( 83%) 7
Split 4 10,533 (42.1%) 11.1% 0.68 13,032 ( 82%) 11
Split 5 9,340 (37.4%) 15.7% 0.69 12,722 ( 81%) 7
Split 6 7,665 (30.7%) 16.7% 0.71 12,296 ( 78%) 6
Split 7 7,968 (31.9%) 21.5% 0.71 11,792 ( 75%) 4
Split 8 8,271 (33.1%) 23.7% 0.72 11,767 ( 74%) 3
Split 9 7,850 (31.4%) 26.3% 0.72 11,580 ( 73%) 2
Split 10 7,846 (31.4%) 17.0% 0.73 11,149 ( 71%) 3
Split 11 7,281 (29.1%) 11.5% 0.78 10,520 ( 67%) 4
Split 12 7,058 (28.2%) 10.1% 0.80 10,197 ( 65%) 3
Split 13 4,406 (17.6%) 3.2% 0.73 9,618 ( 61%) 2
Resample 2,961 (11.8%) NA% 1.44 8,915 ( 56%) NA
Sampling diagnostics for SMC run 2 of 2 (25,000 samples)
Eff. samples (%) Acc. rate Log wgt. sd Max. unique Est. k
Split 1 14,235 (56.9%) 16.6% 0.55 15,810 (100%) 10
Split 2 12,875 (51.5%) 23.9% 0.64 13,538 ( 86%) 6
Split 3 11,304 (45.2%) 30.2% 0.65 13,227 ( 84%) 4
Split 4 11,586 (46.3%) 34.0% 0.67 12,938 ( 82%) 3
Split 5 10,172 (40.7%) 15.7% 0.67 12,753 ( 81%) 7
Split 6 8,921 (35.7%) 23.6% 0.70 12,437 ( 79%) 4
Split 7 8,191 (32.8%) 21.5% 0.71 11,894 ( 75%) 4
Split 8 7,489 (30.0%) 24.2% 0.73 11,616 ( 74%) 3
Split 9 7,201 (28.8%) 26.7% 0.72 11,422 ( 72%) 2
Split 10 7,459 (29.8%) 8.2% 0.73 11,060 ( 70%) 7
Split 11 7,404 (29.6%) 12.6% 0.78 10,219 ( 65%) 4
Split 12 7,128 (28.5%) 9.1% 0.74 10,889 ( 69%) 3
Split 13 4,214 (16.9%) 2.0% 0.73 8,795 ( 56%) 4
Resample 3,413 (13.7%) NA% 1.50 9,242 ( 58%) NA
• Watch out for low effective samples, very low acceptance rates (less
than 1%), large std. devs. of the log weights (more than 3 or so), and
low numbers of unique plans. R-hat values for summary statistics should
be between 1 and 1.05.
• SMC convergence: Increase the number of samples. If you are
experiencing low plan diversity or bottlenecks as well, address those
issues first.
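The R-hat guidance in the note above (values should lie between 1 and 1.05) is easy to check mechanically. Below is a minimal sketch in Python using a handful of the values printed in the summary. The helper name `flag_rhat` is hypothetical and is not part of redist.

```python
# A few R-hat values copied from the summary output above.
rhat = {
    "plan_dev": 1.007102,
    "uss_18_dem_oro": 1.051739,
    "gov_18_dem_val": 1.037330,
    "atg_18_dem_nel": 1.045510,
    "adv_18": 1.053836,
    "egap": 1.024320,
}


def flag_rhat(values, upper=1.05):
    """Return the names of statistics whose R-hat exceeds `upper`."""
    return [name for name, value in values.items() if value > upper]


flagged = flag_rhat(rhat)
print(flagged)
```

In the full table, `uss_18_dem_oro` and `adv_18` are the only statistics above 1.05, which may be what triggers the convergence warning.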
Other than this one thing, this looks great to me!
Added @CoryMcCartan thanks!
Thanks. @christopherkenny if this is good to you we can merge!
Amazing -- I just double-checked everything is 5k, so @christopherkenny just let me know whenever and I'll run to finalize.
Go for it! Great work @tylersimko!
Redistricting requirements
In Texas, districts must meet US constitutional requirements, but there are no state-specific statutes.
Interpretation of requirements
We enforce a maximum population deviation of 0.5%.
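As a rough illustration of what a 0.5% maximum deviation means, the sketch below computes the largest relative gap between any district and the equal-population target. It assumes deviation is measured relative to the target (total population divided by the number of districts). The function name and the three-district populations are hypothetical, not Texas data.

```python
def max_pop_deviation(district_pops):
    """Largest relative deviation of any district from the equal-population target."""
    target = sum(district_pops) / len(district_pops)
    return max(abs(p - target) / target for p in district_pops)


# Hypothetical three-district plan (illustrative numbers only).
pops = [100_200, 99_900, 99_900]
dev = max_pop_deviation(pops)

# The plan satisfies the requirement if the deviation is at most 0.5%.
print(f"max deviation = {dev:.2%}, within tolerance: {dev <= 0.005}")
```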
Data Sources
Data for Texas comes from the ALARM Project's 2020 Redistricting Data Files.
Pre-processing Notes
We estimate CVAP populations with the cvap R package. We also pre-process the map to split it into clusters for simulation, which has a slight effect on the types of district plans that will be sampled.
Simulation Notes
We sample 50,000 districting plans for Texas across two independent runs of the SMC algorithm. Due to the size and complexity of Texas, we split the simulations into multiple steps.
1. Clustering procedure
First, we run simulations in three major metropolitan areas: Greater Houston, a combination of Greater San Antonio and Austin, and Dallas-Fort Worth. Each cluster consists of the counties in the corresponding Census Metropolitan Statistical Area (MSA):
Houston–The Woodlands–Sugar Land: Austin, Brazoria, Chambers, Fort Bend, Galveston, Harris, Liberty, Montgomery, Waller.
Austin–Round Rock–Georgetown: Bastrop, Caldwell, Hays, Travis, Williamson.
San Antonio–New Braunfels: Atascosa, Bandera, Bexar, Comal, Guadalupe, Kendall, Medina, Wilson.
Dallas–Fort Worth–Arlington: Collin, Dallas, Denton, Ellis, Hunt, Kaufman, Rockwall, Johnson, Parker, Tarrant, Wise.
These simulations run the SMC algorithm within each cluster with a 0.25% population tolerance. Because each cluster will have leftover population, we apply an additional constraint that incentivizes leaving any unassigned areas on the edge of these clusters to avoid discontiguities.
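The leftover population arises because a cluster's population is rarely a whole multiple of the district target. The sketch below shows the arithmetic with made-up numbers. The cluster population, target, and function name are all hypothetical, and the PR does not state how the number of districts per cluster was actually chosen.

```python
def cluster_plan(cluster_pop, target_pop, tol=0.0025):
    """Whole districts that fit in a cluster, the leftover population,
    and the allowed district population range under tolerance `tol`."""
    n_districts = cluster_pop // target_pop
    leftover = cluster_pop - n_districts * target_pop
    bounds = (target_pop * (1 - tol), target_pop * (1 + tol))
    return n_districts, leftover, bounds


# Hypothetical cluster of 7.1 million people and a target of 767,000 per district.
n, leftover, bounds = cluster_plan(7_100_000, 767_000)
print(n, leftover, bounds)
```

Because each district may differ from the target by up to 0.25%, the leftover population varies somewhat from one sampled plan to the next.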
In each cluster, we apply hinge Gibbs constraints of strength 3 to encourage the formation of Hispanic CVAP opportunity districts. In Houston, we also apply a hinge Gibbs constraint of strength 3 to encourage the formation of Black CVAP opportunity districts. These constraints nudge the sampler toward districts whose minority share is above 35%, and penalize districts with minority populations above 70%.
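To make the idea of a hinge-shaped penalty concrete, here is a minimal Python sketch built around the 35% and 70% thresholds mentioned above. The square-root hinge, the function names, and the strength applied to the upper threshold are assumptions for illustration. The exact functional form used by redist is not given in this PR.

```python
import math


def hinge_penalty(shares, strength=3.0, target=0.35):
    """Penalize districts below `target`; zero once a share reaches it (assumed form)."""
    return strength * sum(math.sqrt(max(0.0, target - s)) for s in shares)


def inv_hinge_penalty(shares, strength=3.0, cap=0.70):
    """Penalize districts above `cap`; zero at or below it (assumed form)."""
    return strength * sum(math.sqrt(max(0.0, s - cap)) for s in shares)


# Hypothetical minority CVAP shares for three districts.
shares = [0.40, 0.26, 0.75]
low = hinge_penalty(shares)       # only the 0.26 district contributes
high = inv_hinge_penalty(shares)  # only the 0.75 district contributes

# A Gibbs-type target down-weights a plan by exp(-penalty).
weight = math.exp(-(low + high))
print(low, high, weight)
```

Under this form, a district already between the two thresholds contributes nothing, so the constraint only acts on districts that fall short of 35% or exceed 70%.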
2. Combination procedure
Then, these partial map simulations are combined to run statewide simulations. We again apply hinge Gibbs constraints to encourage the formation of minority opportunity districts.
Validation
Checklist
TODO lines from the template code have been removed
enforce_style() to format my code
redist_map and redist_plans objects, and summary statistics) have been edited
delete this line and all the tags except the reviewers you need @CoryMcCartan @christopherkenny