NYCPlanning / db-acs

American Community Survey data processing for Population Fact Finder

Y2006-2010 row count doesn't match Y2014-2018 #26

Open allthesignals opened 4 years ago

allthesignals commented 4 years ago

~In PFF production, the row count for Y2006-2010 demographic data is 380,411:~

~select count(dataset) from demographic where dataset = 'Y2006-2010';~

~However, in EDM's database, the count for Y2006-2010 demographic is 314,990.~

~It seems like there are some missing variables in this output, and I'm not sure why.~

Still seeing discrepancies in the housing and social profiles.

allthesignals commented 4 years ago

See this thread for a little more context: https://github.com/NYCPlanning/labs-factfinder-api/issues/82

allthesignals commented 4 years ago

We can't re-use production data because 2006-2010 apparently must be re-computed to adjust for inflation.

We should understand why this isn't handled by the app...

allthesignals commented 4 years ago

Single geography study areas don't get run through the processing steps, so they don't get inflated:

The reason we did it that way was so the app wasn't touching any single geographies (they were being pulled directly from the database); it only processes the data when there is a recalculation for an aggregation.
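A minimal sketch of what routing both paths through the same inflation step could look like. Everything here is an assumption for illustration: the names `inflate`, `aggregate`, and `profile`, the variable name `total_income`, and the factor `1.25` are hypothetical and are not taken from the app or from this repo.

```python
from collections import defaultdict

# Hypothetical placeholders, NOT the real adjustment factor or variable list.
INFLATION_FACTOR = 1.25
DOLLAR_VARIABLES = {"total_income"}


def inflate(row, dataset):
    """Scale dollar-valued estimates for the older dataset; leave other rows as-is."""
    if dataset == "Y2006-2010" and row["variable"] in DOLLAR_VARIABLES:
        return {**row, "estimate": row["estimate"] * INFLATION_FACTOR}
    return row


def aggregate(rows):
    """Sum estimates per variable across several geographies."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["variable"]] += row["estimate"]
    return [{"variable": v, "estimate": e} for v, e in totals.items()]


def profile(rows_by_geoid, geoids, dataset):
    """Build a profile for one or more geographies, inflating in both cases."""
    if len(geoids) == 1:
        # Single geography: rows come straight from storage, no aggregation.
        rows = rows_by_geoid[geoids[0]]
    else:
        # Study area: combine the rows of every selected geography.
        rows = aggregate([r for g in geoids for r in rows_by_geoid[g]])
    # The inflation step runs on both paths, so single geographies are covered too.
    return [inflate(r, dataset) for r in rows]
```

With this shape, a single-geography request and an aggregated one would both return inflation-adjusted dollar values for Y2006-2010, while Y2014-2018 rows pass through unchanged.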

allthesignals commented 4 years ago

@SPTKL fixed this! The column counts match now. Closing.

allthesignals commented 4 years ago

I'm sorry, I closed this prematurely. I should've checked the other profiles.

There are still a few remaining discrepancies in housing and social:

Housing: 407,064 rows (Y2006-2010) vs. 409,487 rows (Y2014-2018)

Social: 1,330,227 rows (Y2006-2010) vs. 1,332,650 rows (Y2014-2018)
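Both gaps come out to exactly 2,423 rows, which may mean the same set of rows is missing from each profile. A minimal sketch for listing which rows exist in one dataset but not the other, using an in-memory SQLite table with toy data. Only the `dataset` column appears in this thread; the `housing` table name, the `geoid` and `variable` columns, and the helper `missing_rows` are assumptions about the schema.

```python
import sqlite3

# Row counts quoted above, as (Y2006-2010, Y2014-2018).
counts = {"housing": (407_064, 409_487), "social": (1_330_227, 1_332_650)}
gap = {name: new - old for name, (old, new) in counts.items()}


def missing_rows(conn, table, have, lack):
    """Return (geoid, variable) pairs present in dataset `have` but absent from `lack`."""
    # The table name is interpolated directly, so pass only trusted names.
    sql = f"""
        SELECT a.geoid, a.variable
        FROM {table} AS a
        LEFT JOIN {table} AS b
          ON b.geoid = a.geoid
         AND b.variable = a.variable
         AND b.dataset = ?
        WHERE a.dataset = ?
          AND b.geoid IS NULL
        ORDER BY a.geoid, a.variable
    """
    return conn.execute(sql, (lack, have)).fetchall()


# Toy data: (g1, v2) exists only in Y2014-2018.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE housing (dataset TEXT, geoid TEXT, variable TEXT)")
conn.executemany(
    "INSERT INTO housing VALUES (?, ?, ?)",
    [
        ("Y2014-2018", "g1", "v1"),
        ("Y2014-2018", "g1", "v2"),
        ("Y2014-2018", "g2", "v1"),
        ("Y2006-2010", "g1", "v1"),
        ("Y2006-2010", "g2", "v1"),
    ],
)
gaps = missing_rows(conn, "housing", have="Y2014-2018", lack="Y2006-2010")
print(gap, gaps)
```

Grouping the result by `variable` would show whether the gap is a handful of variables missing across many geographies, or the reverse.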

@SPTKL any ideas? You mentioned the metadata we have is bad. Anything I can do to help?