NYCPlanning / data-engineering-qaqc

streamlit app for data engineering
https://edm-data-engineering.nycplanningdigital.com

Fvk gru expand source versions #267

Closed fvankrieken closed 1 year ago

fvankrieken commented 1 year ago

Expands the source versions table. Currently looks like this:

image

For doitt building footprints the date is fine, but saf should actually have a "23a"-like version, so I now grab that instead of when the file was last modified. I also grab the timestamp of archival from execution_details (if present, since it's relatively new) and put that there for data library inputs. Now looks like this:

image

fvankrieken commented 1 year ago

lgtm! curious why not use the "latest folder" approach for all other datasets? guessing it's because the config.json is often more reliable than the folder name for the version format we want for each dataset

Yeah, I figure the config file is the most consistent and reliable way to get metadata, especially with the execution_details being dumped there now. dcp_saf is not in recipes and doesn't go through the data library machinery, so we don't get that file, hence the special case.
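
For anyone skimming the thread, here is a rough sketch of the two lookup paths described above. It is not the code in this PR: the function names, the `dataset`/`version` and `execution_details`/`timestamp` key layout, and the idea that the "latest folder" approach means picking the highest-sorting version folder name are all assumptions made for illustration.

```python
from typing import List, Optional


def version_from_config(config: dict) -> dict:
    """Read version metadata from an already-parsed config.json.

    Hypothetical key layout: config["dataset"]["version"] and
    config["execution_details"]["timestamp"].
    """
    dataset = config.get("dataset") or {}
    # execution_details is relatively new, so older archives may lack it.
    details = config.get("execution_details") or {}
    return {
        "version": dataset.get("version"),
        "archived_at": details.get("timestamp"),  # None when not present
    }


def version_from_folders(folder_names: List[str]) -> Optional[str]:
    """Special case for a dataset with no config.json (e.g. dcp_saf).

    Assumes version folders sort lexicographically ("22c" < "23a") and that
    a "latest" alias folder should be ignored.
    """
    versions = sorted(name for name in folder_names if name != "latest")
    return versions[-1] if versions else None
```

With this shape, the table code would call `version_from_config` for anything that goes through data library and fall back to `version_from_folders` only for datasets that never get a config file.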