Closed fvankrieken closed 1 year ago
Looks like the run instructions in the README for ZTL are outdated too - will update in the PR for this issue
@fvankrieken @athursland
with Ali's PR https://github.com/NYCPlanning/data-engineering/pull/109 merged, do we wanna drop the tasks around changing old files and tables and close this?
I think we should update s3 folders to align with this
renamed and moved all folders in `edm-publishing/db-zoningtaxlots` to have the pattern `YYYY-MM-01/output/...`
ZTL uses `%Y/%m/01` as date format for versioning, as written here (if link is broken, changes have been merged and it's one of the only lines in the file). In s3 this creates unnecessary subfolders by month and day. This same version format is used in `EDM_DATA` for archives which get used in ztl build to generate qaqc outputs. I would propose
- `%Y-%m-01` to align a bit better with other products
- `EDM_DATA` data to do something more postgres friendly than having dashes (maybe just remove dashes), both for writing and reading
- `EDM_DATA` versions, both in table names and column values in the 3 qc tables
- `latest` folder which doesn't have this issue

Beyond that, maybe worth going to the branch/date format we use for most repos, though maybe that can wait for another issue where we aim to align all data products, because many differ just barely (output/no output, whether latest is in main or one level above it, etc)
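
For illustration, here is a rough sketch of how the two version formats behave when used in an s3 key, plus one possible dash-free variant for postgres. The key layout, the file name `example.csv`, the `%Y%m01` format, and the `qaqc_` table prefix are all assumptions made up for this example, not what the build actually uses.

```python
from datetime import date

OLD_FORMAT = "%Y/%m/01"  # current format quoted in this issue
NEW_FORMAT = "%Y-%m-01"  # proposed format from this issue
PG_FORMAT = "%Y%m01"     # assumption: one reading of "maybe just remove dashes"


def format_version(build_date: date, fmt: str) -> str:
    """Render a build date as a version string; the day is always the literal 01."""
    return build_date.strftime(fmt)


def s3_key(product: str, version_str: str, filename: str) -> str:
    """Hypothetical key layout: <product>/<version>/output/<filename>."""
    return f"{product}/{version_str}/output/{filename}"


build_date = date(2023, 5, 17)

old_key = s3_key("db-zoningtaxlots", format_version(build_date, OLD_FORMAT), "example.csv")
new_key = s3_key("db-zoningtaxlots", format_version(build_date, NEW_FORMAT), "example.csv")

# The slashes in the old format become extra "folders" in s3 (year/month/day).
print(old_key)  # db-zoningtaxlots/2023/05/01/output/example.csv
print(new_key)  # db-zoningtaxlots/2023-05-01/output/example.csv

# Unquoted postgres identifiers cannot contain dashes or start with a digit,
# so a dash-free version with a (hypothetical) prefix avoids quoting table names.
pg_version = format_version(build_date, PG_FORMAT)
table_name = f"qaqc_{pg_version}"
print(table_name)  # qaqc_20230501
```

With the old format the version adds three path segments to every key. The proposed format keeps the whole version in a single segment, which is what lets each version show up as one folder in s3.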