ihmeuw / vivarium_census_prl_synth_pop

US Census Probabilistic Record Linkage synthetic population generation
BSD 3-Clause "New" or "Revised" License

use complete state categories #280

Closed by stevebachmeier 1 year ago

stevebachmeier commented 1 year ago

Title: Use complete state categories

Description

Changes and notes

This was found while working on MIC-4000 (column-level integration testing). We currently set the categories of the state column's categorical dtype to whichever states happen to exist in the dataset. This PR instead uses all states from the metadata, regardless of whether they appear in the dataset. Additionally, for the sample data, it adds the fake state "US" to the categories.
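
For illustration, the difference between the two behaviors can be sketched with pandas as below. This is a minimal sketch, not the code in this PR: `ALL_STATES`, `SAMPLE_DATA_FAKE_STATE`, and `state_dtype` are hypothetical names, and the short state list stands in for whatever the metadata actually contains.

```python
import pandas as pd

# Hypothetical stand-in for the full state list kept in metadata
# (shortened here; not the repo's actual constant).
ALL_STATES = ["AK", "AL", "CA", "NY", "WA"]
# The fake state used only in the sample data, per the description above.
SAMPLE_DATA_FAKE_STATE = "US"


def state_dtype(is_sample_data: bool = False) -> pd.CategoricalDtype:
    """Build the state dtype from the full list, not from observed values."""
    categories = list(ALL_STATES)
    if is_sample_data:
        categories.append(SAMPLE_DATA_FAKE_STATE)
    return pd.CategoricalDtype(categories=categories)


df = pd.DataFrame({"state": ["CA", "WA", "CA"]})

# Previous behavior: categories are inferred from the states present in the data.
inferred = df["state"].astype("category")

# Behavior described in this PR: categories always cover every state in metadata.
complete = df["state"].astype(state_dtype())

print(list(inferred.cat.categories))  # only the states that appear in df
print(list(complete.cat.categories))  # every state in ALL_STATES
```

With categories fixed up front, the column's dtype is the same for every dataset, whichever states a given shard or sample happens to contain.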

Verification and Testing

Regenerated the sample data, after which the pseudopeople pytests pass on my MIC-4000 branch.