NYCPlanning / db-data-library

📚 Data Library
https://nycplanning.github.io/db-data-library/library/index.html
MIT License

Update data products using `dcp_mappluto` to `dcp_mappluto_wi` #316

Open mbh329 opened 1 year ago

mbh329 commented 1 year ago

Currently we have 3 MapPLUTO-related datasets (dcp_mappluto, dcp_mappluto_wi, dcp_mappluto_clipped) that are ingested via data-library. To follow the pattern established in our other datasets, we need to update dcp_mappluto to be the version whose boundaries do not include water (as of now, it is identical to dcp_mappluto_wi). To do this, we will need to change every instance in which one of our data products uses dcp_mappluto to dcp_mappluto_wi unless otherwise specified. That is, for each data product we need to ask: does it need the water-included version, or does it actually benefit from a clipped version of the dataset? Each data product will have to be carefully examined to determine the right version of MapPLUTO, but generally they should include water, as the larger boundaries help identify features that might not fall within a strictly land-based boundary.
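As a rough illustration of the rename itself, here is a minimal Python sketch that rewrites the bare name dcp_mappluto to dcp_mappluto_wi while leaving dcp_mappluto_wi and dcp_mappluto_clipped untouched. The names `BARE_MAPPLUTO` and `point_to_water_included` are hypothetical and not part of any of our repos, and a blind rewrite like this does not replace the per-product review described above.

```python
import re

# Match the bare dataset name only. The lookbehind/lookahead reject any
# identifier character on either side, so dcp_mappluto_wi and
# dcp_mappluto_clipped are not matched.
BARE_MAPPLUTO = re.compile(r"(?<![A-Za-z0-9_])dcp_mappluto(?![A-Za-z0-9_])")


def point_to_water_included(text: str) -> str:
    """Return text with every bare dcp_mappluto replaced by dcp_mappluto_wi."""
    return BARE_MAPPLUTO.sub("dcp_mappluto_wi", text)
```

Applied to a line that already names dcp_mappluto_wi or dcp_mappluto_clipped, the function returns the line unchanged, so it should be safe to run more than once over the same file.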

Data products/scripts using dcp_mappluto:

Current repos:

NYCPlanning/db-pluto

NYCPlanning/db-data-library

NYCPlanning/db-facilities

NYCPlanning/db-cpdb

NYCPlanning/db-developments

NYCPlanning/db-colp

NYCPlanning/action-library-archive

LABS

NYCPlanning/labs-search-api

NYCPlanning/labs-layers-api

NYCPlanning/labs-geosearch-pad-normalize

NYCPlanning/labs-zola

NYCPlanning/labs-ember-search

NYCPlanning/labs-applicantmaps

NYCPlanning/labs-streets

GIS

NYCPlanning/gis-mappluto-convert

OLD/Deprecated

NYCPlanning/dob-permits-geocode

NYCPlanning/db-housing

NYCPlanning/db-facilities-old

NYCPlanning/action-library-archive

NYCPlanning/db-civic-data-loader

NYCPlanning/data-loading-scripts

NYCPlanning/db-airflow-dags

NYCPlanning/db-data-recipes

actions-marketplace-validations/NYCPlanning_action-library-archive

NYCPlanning/db-pad

AmandaDoyle commented 1 year ago

@mbh329 Thanks for putting this together. This is quite far-reaching. Based on this, we cannot change dcp_mappluto in data library to point to the clipped version of MapPLUTO anytime soon. Here's what I see as the logical steps to update dcp_mappluto to dcp_mappluto_wi, in line with other datasets in data library:

1) Update the DE repos where dcp_mappluto is pulled in via the bash/dataloading.sh step, which looks like 3 repos: db-facilities, db-cpdb, and db-developments.
2) Investigate where and how OSE and GIS pull dcp_mappluto from, and what the implications would be if we update data library.
3) Investigate other DE repos where dcp_mappluto is mentioned, but not via a dataloading step, determine what needs to change, and map out any downstream impacts.
4) Ignore any repo that's archived.

We can start on step 1 as a maintenance task whenever we want to, and we can schedule it as a project. Steps 2 and 3 may take a little more time. What do you think?
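For steps 1 and 3, a small audit script could help list where the bare name still appears in a local checkout of a repo. The following is only a sketch under assumptions: the function name `find_bare_references` and the default file suffixes are hypothetical choices, not an existing tool, and the suffix list would need adjusting per repo.

```python
import re
from pathlib import Path

# Bare dcp_mappluto only; dcp_mappluto_wi and dcp_mappluto_clipped are skipped.
BARE = re.compile(r"(?<![A-Za-z0-9_])dcp_mappluto(?![A-Za-z0-9_])")


def find_bare_references(root, suffixes=(".sh", ".sql", ".py", ".yml")):
    """Return (relative path, line number, stripped line) for each bare reference."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Only scan regular files with one of the chosen suffixes.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if BARE.search(line):
                hits.append((str(path.relative_to(root)), lineno, line.strip()))
    return hits
```

Running it against each checkout and reviewing the hits by hand would give a starting inventory. It only finds literal mentions in the scanned file types, so it would not answer step 2 (how OSE and GIS pull the dataset).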