NYCPlanning / data-engineering-qaqc

streamlit app for data engineering
https://edm-data-engineering.nycplanningdigital.com

cpdb admin boundary #128

Closed td928 closed 2 years ago

td928 commented 2 years ago


Overview

This check compares the unique values of the admin_boundary_type field in the cpdb_adminbounds table, which should be stable from version to version. A change in those values can tell us whether the spatial join worked properly or not.

adminbounds.py

Not the most satisfying output: the intended result is that the lists of unique values are the same across versions, in which case no values are displayed.
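For illustration, the comparison described above could look roughly like the sketch below. This is not the code in adminbounds.py; the function name `compare_unique_values` and the sample values are hypothetical, and the real check presumably reads the values from the cpdb_adminbounds table rather than from hard-coded lists.

```python
from typing import Dict, Iterable, List


def compare_unique_values(
    previous: Iterable[str], latest: Iterable[str]
) -> Dict[str, List[str]]:
    """Return the values that appear in only one of the two versions."""
    prev_set, latest_set = set(previous), set(latest)
    return {
        "added": sorted(latest_set - prev_set),    # only in the latest version
        "removed": sorted(prev_set - latest_set),  # only in the previous version
    }


# Hypothetical admin_boundary_type values from two builds (not real data).
previous_types = ["borough", "commboard", "council", "council"]
latest_types = ["borough", "commboard", "council"]

diff = compare_unique_values(previous_types, latest_types)
if not diff["added"] and not diff["removed"]:
    # The expected case: nothing to display.
    print("admin_boundary_type values are unchanged")
else:
    print(diff)
```

An empty result in both lists is the healthy case, which is why the page normally has nothing to show.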

data ingestion

I think the more interesting part of this PR might be the refactoring work on the data ingestion, which incorporates the new tables from the cpdb output and handles the data structures in the output directory. The new design also allows more tables to be added to the data dictionary as they are needed in the future.
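A minimal sketch of the data dictionary idea, under stated assumptions: the names `TABLES` and `load_tables`, the CSV file format, and the file name are all hypothetical and chosen only for illustration; the actual ingestion code in this PR may be structured differently.

```python
import csv
from pathlib import Path
from typing import Dict, List, Optional

# Hypothetical data dictionary: logical table name -> file name in the output directory.
# Adding a new table later would only require a new entry here.
TABLES: Dict[str, str] = {
    "adminbounds": "cpdb_adminbounds.csv",
}


def load_tables(
    output_dir: Path, tables: Optional[Dict[str, str]] = None
) -> Dict[str, List[dict]]:
    """Read every table listed in the dictionary that exists in output_dir."""
    tables = TABLES if tables is None else tables
    loaded: Dict[str, List[dict]] = {}
    for name, filename in tables.items():
        path = Path(output_dir) / filename
        if not path.exists():
            continue  # skip tables missing from this version's output
        with path.open(newline="") as f:
            loaded[name] = list(csv.DictReader(f))
    return loaded
```

Keeping the table list in one dictionary means the loader itself does not change when more cpdb outputs are needed.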

Oysters1874 commented 2 years ago

lgtm, works locally. I think this can be merged at the moment.

td928 commented 2 years ago

Refactored into components in the commit above. It should still work as before. Please let me know if the approval stands. Thanks! @Oysters1874 @abrieff

Oysters1874 commented 2 years ago

okay, it looks good to me. So for my part, should I also put it in the component folder?

abrieff commented 2 years ago

still stands 👍 and @Oysters1874 i would say if it's easy enough go for it