cfpb / hmda-platform

The HMDA Submission backend applications.
Creative Commons Zero v1.0 Universal

Census Flat File Value Replacement #4891

Closed PatrickGoRaft closed 1 month ago

PatrickGoRaft commented 2 months ago

The source census flat file contains some incorrect entries. These need to be corrected in the code that derives the fields the HMDA Platform needs.

Example: ffiec_census_df.loc[(ffiec_census_df['MSA/MD'] == "99999"), 'MSA/MD Name'] = ""

Example: ffiec_census_df.loc[(ffiec_census_df['Median Age'] == "2002"), 'Median Age'] = "6"
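
As a minimal sketch, the two replacements above can be run against a toy DataFrame. It assumes the columns are read as strings, as the quoted comparisons suggest. The sample rows and the MSA/MD name are made up for illustration and do not come from the real flat file.

```python
import pandas as pd

# Hypothetical sample rows; the real frame is loaded from the FFIEC census flat file.
ffiec_census_df = pd.DataFrame(
    {
        "MSA/MD": ["99999", "47894", "99999"],
        "MSA/MD Name": ["NA", "Example Metro Area", "NA"],
        "Median Age": ["2002", "35", "41"],
    }
)

# Blank out the name on rows whose MSA/MD code is 99999.
ffiec_census_df.loc[ffiec_census_df["MSA/MD"] == "99999", "MSA/MD Name"] = ""

# Replace the erroneous median age of 2002 with 6.
ffiec_census_df.loc[ffiec_census_df["Median Age"] == "2002", "Median Age"] = "6"
```

Rows that do not match either condition are left untouched, so the corrections can be applied safely even when a given file no longer contains the bad values.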

tptignor commented 2 months ago

Associated work: https://github.com/cfpb/hmda-platform/pull/4875 https://github.com/cfpb/hmda-platform/pull/4894

tptignor commented 1 month ago

Reopening after additional considerations at today's data standup.

tptignor commented 1 month ago

Discussed with @Kibrael and Jonathan B. Our understanding is that updates to the housing age data arrive in separate files that the US Census Bureau publishes in years ending in 2, 7 and 0, so housing age data has a built-in error range of 0-5 years. Census Bureau guidance is that an ACS summary file "jam value" of "18" for MEDIAN YEAR STRUCTURE BUILT (which comes with an erroneous "2002" value for median house age) is equivalent to "2014+", which in 2020 indicates a house age of 6 years. Given this, our plan is to keep a fixed 2002->6 mapping for MedianAge and expect corrections to arrive with future Census Bureau source data publications.
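
The arithmetic behind the fixed mapping can be sketched as follows. The constant and function names are hypothetical and chosen for illustration; they are not taken from the platform code. Only the values (2020, "2014+", 2002, 6) come from the discussion above.

```python
# Assumed constants, taken from the discussion in this thread rather than from the platform code.
REFERENCE_YEAR = 2020          # year against which house age is computed
TOP_CODED_YEAR_BUILT = 2014    # the "2014+" bucket that the jam value 18 stands for
ERRONEOUS_MEDIAN_AGE = "2002"  # value that shows up in the flat file for the affected rows

# 2020 - 2014 = 6, the fixed replacement value.
CORRECTED_MEDIAN_AGE = str(REFERENCE_YEAR - TOP_CODED_YEAR_BUILT)


def correct_median_age(value: str) -> str:
    """Return the corrected median age, leaving every other value untouched."""
    return CORRECTED_MEDIAN_AGE if value == ERRONEOUS_MEDIAN_AGE else value
```

Here `correct_median_age("2002")` gives `"6"`, and any other input is returned unchanged. The erroneous 2002 is also what 2020 - 18 evaluates to, which would be consistent with the jam value having been treated as a year built, though nothing in this thread confirms that.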