NYCPlanning / db-factfinder

data ETL for population fact finder (decennial + acs)
https://nycplanning.github.io/db-factfinder/factfinder/
MIT License

Remove ACS Variables from 2010 that don't match metadata file #244

Closed. abrieff closed this 2 years ago

abrieff commented 2 years ago

Addresses #243. This reverts commit ef50b5bbdda619bdf0be5884f1b5f981bb753021.

When we removed the base variable join from the script, we stopped respecting the metadata file for 2010 and just piped the input Excel file through. The Excel file has columns for variables that are not in the metadata file, which causes issues for OSE. This PR adds an inner join on the metadata file so that we only return records with a matching metadata record.
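For readers who want to see the shape of the change, here is a minimal pandas sketch of an inner join against the metadata. It is an illustration under assumptions, not the code in this PR: the function name `filter_to_metadata`, the column names `pff_variable` and `base_variable`, and the sample rows are all hypothetical.

```python
import pandas as pd


def filter_to_metadata(acs: pd.DataFrame, metadata: pd.DataFrame) -> pd.DataFrame:
    """Keep only the rows whose variable has a metadata record.

    Assumes (hypothetically) that both frames share a `pff_variable` column.
    """
    # Deduplicate the metadata keys so the join cannot multiply input rows.
    known = metadata[["pff_variable"]].drop_duplicates()
    # The inner join drops any input variable that the metadata does not list.
    return acs.merge(known, on="pff_variable", how="inner")


# Made-up sample data: `legacy_var` stands in for a column that exists
# in the Excel input but has no metadata record.
acs = pd.DataFrame(
    {
        "pff_variable": ["pop", "mdage", "legacy_var"],
        "geoid": ["36005", "36005", "36005"],
        "e": [100, 35, 7],
    }
)
metadata = pd.DataFrame(
    {"pff_variable": ["pop", "mdage"], "base_variable": ["pop", "pop"]}
)

filtered = filter_to_metadata(acs, metadata)
```

With this sample input, `filtered` keeps the `pop` and `mdage` rows and drops `legacy_var`.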

Needs 2 Reviewers

mbh329 commented 2 years ago

This looks good, but honestly I would like another set of eyes on it: @td928 @Oysters1874 @AmandaDoyle

abrieff commented 2 years ago

I think the key thing that makes me a little nervous is just the dependence on the metadata file having every relevant field. But from what I remember about our conversations last time around, that's expected to be the case.
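One way to make that dependence visible would be to list the input variables that the inner join would drop. The following is only a sketch under assumptions: the helper `unmatched_variables`, the `pff_variable` key, and the sample frames are hypothetical and do not come from the repository.

```python
import pandas as pd


def unmatched_variables(
    acs_input: pd.DataFrame, meta: pd.DataFrame, key: str = "pff_variable"
) -> list:
    """Return the input variables that have no metadata record."""
    merged = (
        acs_input[[key]]
        .drop_duplicates()
        # A left join with indicator=True adds a `_merge` column that marks
        # rows found only in the input as "left_only".
        .merge(meta[[key]].drop_duplicates(), on=key, how="left", indicator=True)
    )
    return sorted(merged.loc[merged["_merge"] == "left_only", key])


# Made-up sample data for illustration only.
acs_input = pd.DataFrame({"pff_variable": ["pop", "mdage", "legacy_var"]})
meta = pd.DataFrame({"pff_variable": ["pop", "mdage"]})

dropped = unmatched_variables(acs_input, meta)
```

Logging or asserting on `dropped` before the join would surface a gap in the metadata file instead of silently filtering the variable out.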

mbh329 commented 2 years ago

That makes me nervous as well, but this whole data update was pretty wonky.

abrieff commented 2 years ago

The fact that OSE relies on the metadata files for their app to function should provide some confidence there. They opened a separate ticket because one base variable was missing for a pff variable, and the same would happen for any included variable that doesn't have a matching metadata record.