christopherkenny commented 3 months ago

Redistricting requirements

Per Hawaii Revised Statutes 25-2(b)(1)-(4) and (6), as in force for the 2010 cycle, districts must:

1. not unduly favor any person or party;

2. be contiguous, except when encompassing more than one island;

3. be compact;

4. where possible, follow geographical and recognized features and coincide with tract boundaries;

6. where practicable, avoid mixing regions with different socioeconomic interests.

Algorithmic Constraints

We enforce a maximum population deviation of 0.5%. We use Census tracts in line with 25-2(b)(4). In the absence of regional knowledge about features and socioeconomic interests, we use municipalities to attempt to enforce 25-2(b)(4) and (6).
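For readers unfamiliar with the tolerance, here is a minimal Python sketch of one common reading of it: each district's relative departure from the ideal (equal-share) population must stay within 0.5%. The function name and the example populations are hypothetical and are not taken from the analysis code.

```python
def max_population_deviation(district_pops):
    """Largest relative deviation of any district from the ideal population."""
    ideal = sum(district_pops) / len(district_pops)
    return max(abs(p - ideal) / ideal for p in district_pops)


# Hypothetical two-district plan; the numbers are illustrative only.
pops = [680_000, 682_000]
dev = max_population_deviation(pops)
within_tolerance = dev <= 0.005  # the 0.5% tolerance stated above
```

A plan whose `dev` exceeds 0.005 would be rejected under this reading of the constraint.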

Data Sources

Data for Hawaii comes from the ALARM Project's 2010 Redistricting Data Files.

Pre-processing Notes

Islands are manually connected in the adjacency graph, but this has no bearing on the simulation.
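As an illustration of what manually connecting islands means for an adjacency graph, the following Python sketch counts connected components before and after adding one hand-picked edge. The unit names and the chosen link are hypothetical; the actual pre-processing code is not shown in this thread.

```python
from collections import deque


def components(adj):
    """Return the connected components of an undirected graph as a list of sets."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in comp:
                    comp.add(nbr)
                    queue.append(nbr)
        seen |= comp
        comps.append(comp)
    return comps


def add_edge(adj, a, b):
    """Add an undirected edge between units a and b."""
    adj[a].add(b)
    adj[b].add(a)


# Hypothetical units: A1 and A2 on one island, B1 and B2 on another.
adj = {"A1": {"A2"}, "A2": {"A1"}, "B1": {"B2"}, "B2": {"B1"}}
n_before = len(components(adj))  # each island is its own component
add_edge(adj, "A2", "B1")        # manual link chosen by the analyst
n_after = len(components(adj))   # the graph is now a single component
```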

Simulation Notes

We sample 5,000 districting plans for Hawaii over 2 independent runs of the SMC algorithm. We use partial SMC to draw one district in the contiguous portion of Honolulu County and assign the remainder to district 2.
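The "draw one district, assign the rest" step can be pictured with a small Python sketch. It assumes the sampler returns the set of units in the drawn district; the unit IDs and the function name are hypothetical, and this is not the SMC implementation itself.

```python
def complete_plan(units, district_1):
    """Label units in the sampled district as 1 and every remaining unit as 2."""
    return {u: 1 if u in district_1 else 2 for u in units}


# Hypothetical unit IDs; "drawn" stands in for one sampled district.
units = ["u1", "u2", "u3", "u4", "u5"]
drawn = {"u1", "u2"}
plan = complete_plan(units, drawn)

# 5,000 plans over 2 independent runs gives 2,500 plans per run.
n_plans, n_runs = 5_000, 2
per_run = n_plans // n_runs
```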

Validation

validation_20240515_2244

SMC: 5,000 sampled plans of 2 districts on 344 units
`adapt_k_thresh`=0.99 • `seq_alpha`=0.5
`pop_temper`=0

Plan diversity 80% range: 0.23 to 0.67
✖ WARNING: Low plan diversity

R-hat values for summary statistics:
   pop_overlap      total_vap       plan_dev      comp_edge    comp_polsby      pop_white 
         1.000          1.000          1.001          1.000          1.000          1.000 
      pop_aian      pop_other       pop_nhpi        pop_two      pop_black       pop_hisp 
         1.000          1.002          1.001          1.000          1.000          1.000 
     pop_asian        vap_two      vap_other       vap_nhpi      vap_asian       vap_hisp 
         1.001          1.001          1.002          1.001          1.000          1.000 
     vap_black      vap_white       vap_aian pre_20_dem_bid pre_16_dem_cli gov_18_rep_tup 
         1.000          1.001          1.000          1.000          1.000          1.000 
uss_16_dem_sch uss_16_rep_car gov_18_dem_ige pre_20_rep_tru uss_18_rep_cur pre_16_rep_tru 
         1.000          1.000          1.000          1.000          1.000          1.000 
uss_18_dem_hir         arv_20         adv_16         adv_18         arv_18         adv_20 
         1.000          1.000          1.000          1.000          1.000          1.000 
        arv_16    muni_splits            ndv            nrv        ndshare          e_dvs 
         1.000          1.001          1.000          1.000          1.000          1.000 
          egap 
         1.000 
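For context on the R-hat table, here is a Python sketch of the classic Gelman-Rubin statistic, which compares variance between independent runs with variance within them. The package may compute a different variant (for example a split or rank-normalized one), so treat this only as an illustration; the run values below are made up.

```python
from statistics import mean, variance


def r_hat(chains):
    """Classic Gelman-Rubin statistic for equal-length runs of one summary statistic."""
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    within = mean(variance(c) for c in chains)   # W: average within-run variance
    between = n * variance(chain_means)          # B: scaled variance of run means
    pooled = (n - 1) / n * within + between / n  # pooled variance estimate
    return (pooled / within) ** 0.5


# Made-up values of one summary statistic from two runs.
run_a = [0.50, 0.52, 0.48, 0.51, 0.49]
run_b = [0.51, 0.49, 0.50, 0.52, 0.48]
agreeing = r_hat([run_a, run_b])
disagreeing = r_hat([run_a, [x + 0.1 for x in run_b]])  # second run shifted
```

When the runs agree, the statistic stays near 1; shifting one run away from the other inflates it well past the 1.05 guideline. With very few draws, this classic form can fall slightly below 1.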

Sampling diagnostics for SMC run 1 of 2 (2,500 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     2,460 (98.4%)      9.6%        0.25 1,556 ( 98%)      5 
Resample    2,344 (93.8%)       NA%        0.25 2,242 (142%)     NA 

Sampling diagnostics for SMC run 2 of 2 (2,500 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     2,462 (98.5%)      8.4%        0.25 1,597 (101%)      6 
Resample    2,351 (94.0%)       NA%        0.25 2,224 (141%)     NA 
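The "Eff. samples" column is an effective-sample-size diagnostic. A standard form for weighted samples is the Kish estimate, (sum of weights)^2 / (sum of squared weights). The Python sketch below assumes that form and takes log weights as input; whether the package uses exactly this formula is not confirmed in this thread.

```python
import math


def effective_sample_size(log_weights):
    """Kish effective sample size computed from log importance weights."""
    m = max(log_weights)  # subtract the maximum for numerical stability
    w = [math.exp(lw - m) for lw in log_weights]
    return sum(w) ** 2 / sum(x * x for x in w)


# Equal weights: every sample counts fully.
ess_equal = effective_sample_size([0.0, 0.0, 0.0, 0.0])
# One dominant weight: the sample behaves like a single draw.
ess_dominated = effective_sample_size([0.0, -1000.0, -1000.0])
```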

•  Watch out for low effective samples, very low acceptance rates (less than 1%), large std.
devs. of the log weights (more than 3 or so), and low numbers of unique plans. R-hat values for
summary statistics should be between 1 and 1.05.
• Low diversity: Check for potential bottlenecks. Increase the number of samples. Examine the
diversity plot with `hist(plans_diversity(plans), breaks=24)`. Consider weakening or removing
constraints, or increasing the population tolerance. If the acceptance rate drops quickly in the
final splits, try increasing `pop_temper` by 0.01.
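To make the diversity warning more concrete, here is a Python sketch under two assumptions that this thread does not confirm: that diversity is measured by the pairwise variation of information (VI) distance between sampled plans, weighted by unit population, and that the "80% range" is the 10th to 90th percentile of those distances. The toy plans and populations are hypothetical, and the package may normalize the distance differently.

```python
import math
from collections import Counter
from itertools import combinations
from statistics import quantiles


def entropy(weights):
    """Shannon entropy of a set of nonnegative weights."""
    total = sum(weights)
    return -sum(w / total * math.log(w / total) for w in weights if w > 0)


def variation_of_information(plan_a, plan_b, pops):
    """Population-weighted VI distance: 2*H(A,B) - H(A) - H(B)."""
    joint, a_marg, b_marg = Counter(), Counter(), Counter()
    for a, b, p in zip(plan_a, plan_b, pops):
        joint[(a, b)] += p
        a_marg[a] += p
        b_marg[b] += p
    return (2 * entropy(joint.values())
            - entropy(a_marg.values())
            - entropy(b_marg.values()))


def diversity_range(plans, pops):
    """10th and 90th percentiles of all pairwise VI distances."""
    dists = [variation_of_information(a, b, pops)
             for a, b in combinations(plans, 2)]
    cuts = quantiles(dists, n=10, method="inclusive")
    return cuts[0], cuts[-1]


# Toy example: four units of equal population, four two-district plans.
pops = [1, 1, 1, 1]
plans = [[1, 1, 2, 2], [1, 1, 2, 2], [1, 2, 1, 2], [1, 2, 2, 1]]
low, high = diversity_range(plans, pops)
```

Identical plans have distance 0, so a sample dominated by near-duplicates pulls the lower end of the range toward 0.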

Checklist

[x] I have followed the instructions
[x] I have updated the tracker
[x] All TODO lines from the template code have been removed
[x] I have merged in the main branch and then recalculated summary statistics
[x] I have run enforce_style() to format my code
[x] The documentation copied above is up-to-date
[x] There are no data files in this pull request
[x] None of the file output paths (for the redist_map and redist_plans objects, and summary statistics) have been edited

@CoryMcCartan

Additional notes

The low number of districts and precincts, combined with keeping the main island together, results in low plan diversity.

CoryMcCartan commented 3 months ago

The link to 2010 guidelines is broken. Let's cite the statute directly: https://law.justia.com/codes/hawaii/2011/division1/title3/chapter25/25-2/

In 2011, the guideline was to use tracts.

christopherkenny commented 1 month ago

2010 Hawaii Congressional Districts

Redistricting requirements

Per Hawaii Revised Statutes 25-2(b)(1)-(4) and (6), as in force for the 2010 cycle, districts must:

1. not unduly favor any person or party;

2. be contiguous, except when encompassing more than one island;

3. be compact;

4. where possible, follow geographical and recognized features and coincide with tract boundaries;

6. where practicable, avoid mixing regions with different socioeconomic interests.

Algorithmic Constraints

We enforce a maximum population deviation of 0.5%. We use Census tracts in line with 25-2(b)(4). In the absence of regional knowledge about features and socioeconomic interests, we use municipalities to attempt to enforce 25-2(b)(4) and (6).

Data Sources

Data for Hawaii comes from the ALARM Project's 2010 Redistricting Data Files.

Pre-processing Notes

Islands are manually connected in the adjacency graph, but this has no bearing on the simulation.

Simulation Notes

We sample 5,000 districting plans for Hawaii over 2 independent runs of the SMC algorithm. We use partial SMC to draw one district in the contiguous portion of Honolulu County and assign the remainder to district 2.

Validation

validation_20240714_1224

SMC: 5,000 sampled plans of 2 districts on 351 units
`adapt_k_thresh`=0.99 • `seq_alpha`=0.5
`pop_temper`=0

Plan diversity 80% range: 0.093 to 0.674
✖ WARNING: Low plan diversity

R-hat values for summary statistics:
   pop_overlap      total_vap       plan_dev      comp_edge    comp_polsby      pop_white        pop_two      pop_black 
         1.000          1.000          1.000          1.000          1.000          1.000          1.000          1.000 
      pop_hisp       pop_aian      pop_other       pop_nhpi      pop_asian        vap_two      vap_other       vap_nhpi 
         1.000          1.000          1.000          1.000          1.000          1.000          1.000          1.000 
     vap_asian       vap_hisp      vap_black      vap_white       vap_aian pre_20_rep_tru pre_20_dem_bid pre_16_dem_cli 
         1.000          1.000          1.000          1.000          1.000          1.000          1.000          1.000 
gov_18_rep_tup uss_16_dem_sch uss_16_rep_car gov_18_dem_ige uss_18_rep_cur pre_16_rep_tru uss_18_dem_hir         arv_20 
         1.000          1.000          1.000          1.000          1.000          1.000          1.000          1.000 
        adv_16         adv_18         arv_18         adv_20         arv_16    muni_splits            ndv            nrv 
         1.000          1.000          1.001          1.000          1.000          1.000          1.000          1.001 
       ndshare          e_dvs           egap 
         1.000          1.000          1.001 

Sampling diagnostics for SMC run 1 of 2 (2,500 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     2,451 (98.1%)     12.3%        0.29 1,591 (101%)      5 
Resample    2,322 (92.9%)       NA%        0.29 2,222 (141%)     NA 

Sampling diagnostics for SMC run 2 of 2 (2,500 samples)
         Eff. samples (%) Acc. rate Log wgt. sd  Max. unique Est. k 
Split 1     2,451 (98.0%)     10.3%        0.29 1,567 ( 99%)      6 
Resample    2,322 (92.9%)       NA%        0.29 2,194 (139%)     NA 

•  Watch out for low effective samples, very low acceptance rates (less than 1%), large std. devs. of the log weights (more than 3
or so), and low numbers of unique plans. R-hat values for summary statistics should be between 1 and 1.05.
• Low diversity: Check for potential bottlenecks. Increase the number of samples. Examine the diversity plot with
`hist(plans_diversity(plans), breaks=24)`. Consider weakening or removing constraints, or increasing the population tolerance. If
the acceptance rate drops quickly in the final splits, try increasing `pop_temper` by 0.01.

alarm-redist / fifty-states

2010 Hawaii Congressional Districts (use VTD not tract) #191
