Closed mbh329 closed 1 year ago
Let me look into the DO issue. I could have sworn I opened up the shapefiles yesterday from this branch.
I understand the DO output would be missing unless a version of DOB data is specified, but I don't think we really want this change on either dev or main?
My intention here is to have someone review the outputs so we can confirm the files are there; once double-checked, I'll remove them and you can approve. Does that sound good, @td928?
I did check them yesterday but don’t think it’s bad to double check
Addresses issue #603. Two reviewers ideally 🌟.
This PR rectifies an issue from earlier PRs #600 and #586, in which I did not add the latest columns DCP Housing requested to the shapefiles that they use frequently for analysis. In addition to adding the 7 new columns to housing.shp.zip and devdb.shp.zip, this PR also addresses an issue with the column names: shapefile field names are limited to 10 characters, so longer names get truncated (old_col_name -> old_col_na). To fix this, we kept the 7 new column names at or below 10 characters.
Test
In order to test this PR, check out this feature branch locally and run through a build of devdb (dataloading to export), paying attention to the final and intermediate tables: _init_bis_devdb, _init_now_devdb, _init_devdb, mid_devdb, final_devdb, export_devdb, shp_housing - we want the column names to be consistent throughout. I also ran the Workflow off this feature branch, and the output can be viewed here: DigitalOcean Output. To test the shapefiles, open them in a GIS program of your choice and make sure they are as expected.
IMPORTANT
For a build of devdb to be successful, you MUST specify an exact date in the 01_dataloading.sh -dob_now_applications 20221001. In addition, you MUST set DOB_DATA_DATE=20221025 in the version.env. This is due to discrepancies between dob data versions.
Larger Issue
There are a lot of discrepancies and irregularities in the column structure of devdb (from final tables to intermediate tables) due to the many different groups involved in the creation of DevDB. We should try to enforce some standardization of the outputs so that we follow the necessary column conventions and don't mix column name styles (e.g. snake_case vs. camelCase).
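As a rough illustration of the two column-name concerns raised in this PR (the 10-character shapefile limit and mixed naming styles), a check like the following could be run against a table's column list before export. This is only a sketch: the function name and the sample columns are hypothetical and not part of the devdb codebase, and the truncation here is a naive cut to 10 characters, which may differ from how a given shapefile writer resolves clashes.

```python
import re

# dBASE (.dbf) attribute tables used by shapefiles allow at most 10 characters per field name.
SHP_FIELD_LIMIT = 10
# Lowercase words separated by single underscores, e.g. "job_number".
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def check_shapefile_columns(columns):
    """Report column names that are too long, not snake_case, or that clash when truncated.

    Hypothetical helper for illustration only.
    """
    too_long = [c for c in columns if len(c) > SHP_FIELD_LIMIT]
    not_snake_case = [c for c in columns if not SNAKE_CASE.match(c)]

    # Group names by their naive 10-character truncation to spot clashes.
    by_truncated = {}
    for c in columns:
        by_truncated.setdefault(c[:SHP_FIELD_LIMIT], []).append(c)
    collisions = {k: v for k, v in by_truncated.items() if len(v) > 1}

    return {
        "too_long": too_long,
        "not_snake_case": not_snake_case,
        "collisions": collisions,
    }


if __name__ == "__main__":
    # Made-up column names, not the actual devdb schema.
    sample = ["old_col_name", "job_number", "JobType", "old_col_names"]
    for problem, names in check_shapefile_columns(sample).items():
        print(problem, names)
```

Running a check like this on each of the intermediate and final tables would flag names that cannot survive the shapefile export unchanged, which is the consistency this PR is trying to protect.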