NYCPlanning / db-equitable-development-tool

Data Repo for the equitable development tool (EDDT)
MIT License

2023 EDDE Update #312

Closed AmandaDoyle closed 1 year ago

AmandaDoyle commented 1 year ago

Goal: Provide updated EDDE data to OSE by the end of March.

App: https://equitableexplorer.planning.nyc.gov/map/data/district

2023 input data: here

Process:

TODOs since we've reviewed new source data

damonmcc commented 1 year ago

reviewing source data in Sharepoint for completeness

mbh329 commented 1 year ago

I think we can rename each new source data file to match the name of its indicator. That would avoid confusion between the source data name and the name of the ingestion script for that indicator, e.g. NTA_data_prepared_for_ArcMap_wCodebook.xlsx -> education_outcome_source_data.xlsx
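A minimal sketch of how the rename could be scripted, assuming the source files sit in one local folder. Only the first mapping entry comes from this thread; `plan_renames` and `apply_renames` are hypothetical helper names, not functions in this repo.

```python
from pathlib import Path

# Delivered file name -> indicator-based name.
# Only this first pair is taken from the thread; add others as they are agreed.
RENAMES = {
    "NTA_data_prepared_for_ArcMap_wCodebook.xlsx": "education_outcome_source_data.xlsx",
}


def plan_renames(directory, renames):
    """Return (old_path, new_path) pairs for mapped files that exist in directory."""
    plan = []
    for old_name, new_name in renames.items():
        old_path = Path(directory) / old_name
        if old_path.exists():
            plan.append((old_path, Path(directory) / new_name))
    return plan


def apply_renames(plan):
    """Rename each planned file, refusing to overwrite an existing target."""
    for old_path, new_path in plan:
        if new_path.exists():
            raise FileExistsError(f"{new_path} already exists")
        old_path.rename(new_path)
```

Printing the plan before calling `apply_renames` gives a chance to review the mapping first.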

mbh329 commented 1 year ago

Notes on improvements:

damonmcc commented 1 year ago

update

damonmcc commented 1 year ago

exported categories to edm-publishing on dev branch. action runs:

mbh329 commented 1 year ago

Housing Security Outputs:

units_affordable (eli, vli, li, mi, midi, hi) for 2017 - 2021 not being populated in the housing security outputs

units_occupied_renter_1721 not populated but are in the PUMS data sent by winnie

The raw data can be accessed here: https://nyco365.sharepoint.com/:x:/r/sites/NYCPLANNING/itd/edm/_layouts/15/Doc.aspx?sourcedoc=%7B79FC4BE4-71E8-4082-B066-4DEA5DECEAA1%7D&file=EDDE_UnitsAffordablebyAMI_2017-2021.xlsx&action=default&mobileredirect=true
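A check like this could be automated against an output CSV. The sketch below uses only the standard library and assumes the output is a CSV with one header row; the column names in the usage note are illustrative, and `unpopulated_columns` is a hypothetical helper, not part of this repo.

```python
import csv
import io


def unpopulated_columns(csv_text, prefixes):
    """Return columns starting with any of the prefixes whose values are all blank."""
    reader = csv.DictReader(io.StringIO(csv_text))
    candidates = [c for c in (reader.fieldnames or []) if c.startswith(tuple(prefixes))]
    has_value = {c: False for c in candidates}
    for row in reader:
        for col in candidates:
            if (row.get(col) or "").strip():
                has_value[col] = True
    # Columns that never saw a non-blank value, in header order.
    return [c for c in candidates if not has_value[c]]
```

For example, `unpopulated_columns(text, ["units_affordable", "units_occupied_renter"])` would list any matching columns that are empty in every row. Note that a file with a header but no data rows reports every matching column.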

fvankrieken commented 1 year ago

Housing Security Outputs:

units_affordable (eli, vli, li, mi, midi, hi) for 2017 - 2021 not being populated in the housing security outputs

units_occupied_renter_1721 not populated but are in the PUMS data sent by winnie

The raw data can be accessed here: https://nyco365.sharepoint.com/:x:/r/sites/NYCPLANNING/itd/edm/_layouts/15/Doc.aspx?sourcedoc=%7B79FC4BE4-71E8-4082-B066-4DEA5DECEAA1%7D&file=EDDE_UnitsAffordablebyAMI_2017-2021.xlsx&action=default&mobileredirect=true

Thanks, @mbh329. I had missed one big section of logic in the utils that is specific to years in column names. See this commit

Latest build is now here - other than local testing for units_affordable and units_housing_tenure, I haven't checked any files since these latest changes, but will pick this up in the morning

fvankrieken commented 1 year ago

In QOL, "prematuremortality" columns are missing, and have 2019 in the header which seems like a pretty sure giveaway that something is off
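A stale year in a header is something a small check could flag. This is a rough sketch, assuming years appear in column names as four-digit tokens such as 2019; the set of expected years is an assumption the caller supplies, and `columns_with_unexpected_years` is a hypothetical helper.

```python
import re

# Four-digit years starting with 19 or 20, not embedded in a longer digit run.
# Range suffixes such as "1721" are deliberately not matched.
YEAR_TOKEN = re.compile(r"(?<!\d)(?:19|20)\d{2}(?!\d)")


def columns_with_unexpected_years(columns, expected_years):
    """Map each column name to the sorted years in it that are not expected."""
    expected = set(expected_years)
    flagged = {}
    for col in columns:
        found = {int(year) for year in YEAR_TOKEN.findall(col)}
        unexpected = sorted(found - expected)
        if unexpected:
            flagged[col] = unexpected
    return flagged
```

Running it over the header row of each output file, with the years the current update should contain, would surface columns still labeled with an older year.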

AmandaDoyle commented 1 year ago

Do the versions of the datasets in db-equitable-development-tool/ingest/data_library/datasets.yml need to be updated? I can't quickly tell if these are used or not.

If they're not used, I don't see any reason to not move the tables over to OSE (except for quality_of_life_puma.csv). I've spot-checked outputs against inputs here and see that the issues above are fixed or in the process of being fixed (namely the NTA issue for the 15 fields in quality_of_life_puma.csv).

damonmcc commented 1 year ago

@AmandaDoyle looks like anytime read_from_s3() is called, it's using the versions declared in the datasets.yml you mentioned

it's currently used in 9 places, all of them in housing_security/ and housing_production/. I think when I made changes to ingest the new transportation data, I replaced any use of s3 files with local files
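To recheck where `read_from_s3()` is still called, a quick scan of the repo could look like the sketch below. It is a plain text search rather than a real parse, so it assumes calls are written as `read_from_s3(` on a single line and skips `def` lines and full-line comments; `find_calls` is a hypothetical helper.

```python
import re
from pathlib import Path

CALL = re.compile(r"\bread_from_s3\s*\(")


def find_calls(root):
    """Count read_from_s3( call sites per Python file under root."""
    root = Path(root)
    counts = {}
    for path in sorted(root.rglob("*.py")):
        total = 0
        for line in path.read_text(encoding="utf-8").splitlines():
            stripped = line.lstrip()
            # Skip the function definition itself and full-line comments.
            if stripped.startswith(("def ", "#")):
                continue
            total += len(CALL.findall(line))
        if total:
            counts[path.relative_to(root).as_posix()] = total
    return counts
```

Summing the values gives the total number of call sites, and the keys show which folders still depend on the versions in datasets.yml.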

fvankrieken commented 1 year ago

Putting this here for now, but while things are fresh just wanted to log pain points, things that have gone wrong

editing to add more thoughts