NYCPlanning / db-factfinder

data ETL for population fact finder (decennial + acs)
https://nycplanning.github.io/db-factfinder/factfinder/
MIT License

Problem with "Speak English less than "very well"" variable metadata for 2021 ACS #258

Closed: TylerMatteo closed this issue 1 year ago

TylerMatteo commented 1 year ago

I noticed one more issue in our testing of the 2021 ACS data in PFF: the data for "Speak English less than "very well"" for the total population aged 5 years and over is showing up blank in the app. To repro, see the staging site here and look for the first row with that heading in the table.

I'm pretty sure this is because of changes to the domain and base_variable fields in the 2021 ACS metadata JSON versus the 2020 data. In the 2020 metadata, the variable lgoenlep1 has a domain of "social" and a base_variable of "pop5pl1". In 2021, it has "community_profiles" and "lgbase", respectively. Our code doesn't have any references to "lgbase" that I can find, so that might be a net-new variable that we would have to make changes for. I think all of this is leading to data not mapping correctly downstream in PFF.
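A rough way to flag this kind of problem is to look for variables whose base_variable is not itself defined in the same metadata file. The sketch below assumes each metadata record is a dict with `pff_variable`, `domain`, and `base_variable` keys; that layout, the helper name `find_missing_bases`, and the sample records are assumptions made for illustration, not the actual contents of the 2021 file.

```python
# Invented sample records for illustration only; the real metadata JSON
# may use different keys and will contain many more variables.
metadata_2021 = [
    {"pff_variable": "pop5pl1", "domain": "social", "base_variable": "pop5pl1"},
    {
        "pff_variable": "lgoenlep1",
        "domain": "community_profiles",
        "base_variable": "lgbase",
    },
]


def find_missing_bases(records):
    """Return {variable: base_variable} for bases not defined as variables."""
    defined = {r["pff_variable"] for r in records}
    return {
        r["pff_variable"]: r["base_variable"]
        for r in records
        if r.get("base_variable") and r["base_variable"] not in defined
    }


missing = find_missing_bases(metadata_2021)
print(missing)
```

An empty result would mean every base_variable resolves inside the metadata file; it would say nothing about whether downstream code in PFF handles that base.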

I could open a PR to change the domain back to "social" and the base_variable back to "pop5pl1", but I don't think we should do that without consulting Population, because those variables may have different numbers in the new data. Also, this may be a good time to step back and make sure there aren't any other unintended updates to the 2021 metadata vs 2020.

damonmcc commented 1 year ago

gonna email Population and give them the results of a programmatic comparison I ran between the current and old metadata JSON files. I'm sending them an Excel file, but also attaching it and the source CSV file here in case it's useful
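For anyone who wants to rerun a comparison like this, here is a minimal sketch. It is not the script used to produce the attached files. It assumes each metadata file is a list of records keyed by `pff_variable`, and the function name `compare_metadata` and the choice of compared fields are hypothetical. The two sample records use the lgoenlep1 values quoted earlier in this issue.

```python
import csv
import io


def compare_metadata(old, new, fields=("domain", "base_variable")):
    """List per-field differences between two lists of metadata records."""
    old_by_var = {r["pff_variable"]: r for r in old}
    new_by_var = {r["pff_variable"]: r for r in new}
    rows = []
    for var in sorted(old_by_var.keys() | new_by_var.keys()):
        o, n = old_by_var.get(var), new_by_var.get(var)
        if o is None or n is None:
            # Variable was added or dropped between the two files.
            rows.append({
                "pff_variable": var,
                "field": "(variable)",
                "old": "missing" if o is None else "present",
                "new": "missing" if n is None else "present",
            })
            continue
        for field in fields:
            if o.get(field) != n.get(field):
                rows.append({
                    "pff_variable": var,
                    "field": field,
                    "old": o.get(field),
                    "new": n.get(field),
                })
    return rows


# Values for lgoenlep1 as described in this issue.
meta_2020 = [
    {"pff_variable": "lgoenlep1", "domain": "social", "base_variable": "pop5pl1"}
]
meta_2021 = [
    {
        "pff_variable": "lgoenlep1",
        "domain": "community_profiles",
        "base_variable": "lgbase",
    }
]

rows = compare_metadata(meta_2020, meta_2021)

# Write the differences as CSV text (to a file object in real use).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["pff_variable", "field", "old", "new"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

With real files, `meta_2020` and `meta_2021` would come from `json.load` on each year's metadata JSON, provided the files match the assumed layout.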

metadata_compare_result.csv meta2017-2021_pluscategory_compared.xlsx