Title: Use complete state categories
Description
Changes and notes
This was found while working on MIC-4000 (column-level integration testing). We currently set the categories of the state column's categorical dtype to whatever states happen to exist in the dataset. This change instead uses all states from the metadata, regardless of whether they appear in the dataset. Additionally, for the sample data, it adds the fake state "US" to the categories.
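
For reviewers less familiar with pandas categoricals, here is a minimal sketch of the difference between the two behaviors. It is not the code in this PR: `ALL_STATES` and `set_state_categories` are hypothetical names, and the state list is truncated for brevity.

```python
import pandas as pd

# Hypothetical stand-in for the full state list kept in metadata (truncated).
ALL_STATES = ["AL", "AK", "AZ", "CA", "WA"]


def set_state_categories(df, column="state", extra=()):
    """Cast `column` to a categorical whose categories are the full state
    list plus any extras, rather than only the values present in the data."""
    categories = list(ALL_STATES) + [c for c in extra if c not in ALL_STATES]
    dtype = pd.CategoricalDtype(categories=categories)
    out = df.copy()
    # Values that are not in `categories` would become NaN in this cast.
    out[column] = out[column].astype(dtype)
    return out


data = pd.DataFrame({"state": ["CA", "WA", "CA"]})

# Old behavior: categories are inferred from the values that happen to exist.
observed_only = data["state"].astype("category")

# New behavior: categories come from the full list, independent of the data.
full = set_state_categories(data)

# Sample-data case: the fake state "US" is appended to the categories.
sample = set_state_categories(data, extra=["US"])

print(list(observed_only.cat.categories))
print(list(full["state"].cat.categories))
print(list(sample["state"].cat.categories))
```

With the full category list, two datasets that contain different subsets of states end up with the same dtype, which makes column-level dtype comparisons stable.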
Verification and Testing
Regenerated the sample data and confirmed that the pseudopeople pytests pass on my MIC-4000 branch.