NYCPlanning / db-acs

American Community Survey data processing for Population Fact Finder

Social columns wildly off #13

Closed SPTKL closed 4 years ago

SPTKL commented 4 years ago

The following columns have values that are wildly off:

C1864DVsn
C65plDAmb
C1864DCog
C65plDSCr
C1864DILD
C65plDCog
CU18DSCr
C1864DAmb
CU18DAmb
CU18DVsn
C65plDHrg
C65plDVsn
CU18DHrg
C1864DSCr
CU18DCog
C1864DHrg
C65plDILD

The methodology itself looks fine. Potential explanations:

  1. The field mapping is wrong.
  2. These columns require a special calculation.

EricaMaurer commented 4 years ago

what's off here?

SPTKL commented 4 years ago

Attachment: social_mismatches.csv.zip

All the numbers are wrong, and they are off by quite a bit. Still investigating.

SPTKL commented 4 years ago

e.g. C65plDVsn ==> S1810_C01_036

    info = df.loc[df.geoid == '1024900', ['S1810_C01_036E']].to_dict('records')
    # info == [{'S1810_C01_036E': 137.0}]

For most of these columns we are not doing any calculation here, so the value above (137) is pulled straight from the source field. In population's calculation, however, we get C65plDVsnE = 5.
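The comparison described here can be sketched in a few lines. This is only an illustration, not the repository's code: `find_mismatches`, the dictionary layout keyed by `(geoid, column)`, and the 5% tolerance are all assumptions made for the example. The two values are the ones quoted in this thread.

```python
def find_mismatches(ours, reference, rel_tol=0.05):
    """Return (geoid, column, our_value, ref_value) tuples for values
    that differ from the reference by more than rel_tol (relative).

    Hypothetical helper: both inputs are dicts keyed by (geoid, column).
    """
    mismatches = []
    for key, ref_value in reference.items():
        our_value = ours.get(key)
        if our_value is None:
            continue  # nothing to compare for this key
        # Guard against division by zero for small reference values.
        denom = max(abs(ref_value), 1.0)
        if abs(our_value - ref_value) / denom > rel_tol:
            mismatches.append((*key, our_value, ref_value))
    return mismatches


# Values quoted in this thread: our output vs. population's calculation.
ours = {("1024900", "C65plDVsnE"): 137.0}
reference = {("1024900", "C65plDVsnE"): 5.0}

mismatches = find_mismatches(ours, reference)
print(mismatches)
```

A relative tolerance is used so that small rounding differences are ignored while gaps like 137 versus 5 are flagged.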

EricaMaurer commented 4 years ago

The example you gave above should be an estimate of 5. It looks like the metadata sheet was incorrectly labeled, so these columns were pulling totals rather than the population with a disability. Should be okay now.

Attachment: meta_2.xlsx
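A check along these lines could catch this kind of mislabeling before it reaches the output. It is a rough sketch under stated assumptions: the `metadata` dict stands in for the metadata sheet, `suspicious_mappings` is a hypothetical helper, and treating the `_C01_` segment of a field name as the marker for a "total" column is an assumption based on the mapping quoted in this thread, not something confirmed here.

```python
# Hypothetical stand-in for the metadata sheet: output column -> source field.
metadata = {
    "C65plDVsn": "S1810_C01_036",  # mapping quoted earlier in this thread
}


def suspicious_mappings(metadata, total_marker="_C01_"):
    """Return output columns whose source field looks like a total column.

    Assumption: fields containing total_marker hold totals rather than
    the population with a disability.
    """
    return sorted(
        column for column, field in metadata.items() if total_marker in field
    )


flagged = suspicious_mappings(metadata)
print(flagged)
```

Any column this flags would still need a manual look at the metadata sheet, since the marker is only a heuristic.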