NYCPlanning / data-engineering

Primary repository for NYC DCP's Data Engineering team

improve CPDB table `cpdb_adminbounds` #802

Closed: damonmcc closed this 2 weeks ago

damonmcc commented 3 weeks ago

related to #779

successful build here. all builds on this branch here

The table `cpdb_adminbounds` had duplicate records; this PR removes them.

The table's record count decreased from ~120K to ~103K. I checked a project that exemplifies the issue (`feature_id = '039LQNFPFAO'`), and its duplicates are gone.

I'd love to add a dbt test to enforce our expectations and to improve the queries that feed this table, but GIS needed this table ASAP, so we can iterate from here.
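For reference, the kind of expectation such a test could enforce might look like the sketch below. It is only an illustration and not the project's actual test: it uses an in-memory SQLite table instead of the real build database, and the columns `admin_boundary_type` and `admin_boundary_id`, the helper `find_duplicates`, and all sample rows are hypothetical (only `feature_id` and the table name appear in this PR).

```python
import sqlite3


def find_duplicates(conn, table, key_columns):
    """Return one row per key combination that appears more than once.

    Each returned row holds the key values followed by the occurrence count.
    Table and column names are interpolated directly, so only pass trusted
    identifiers.
    """
    cols = ", ".join(key_columns)
    query = (
        f"SELECT {cols}, COUNT(*) AS n FROM {table} "
        f"GROUP BY {cols} HAVING COUNT(*) > 1 ORDER BY {cols}"
    )
    return conn.execute(query).fetchall()


# Hypothetical stand-in for the real table; the schema is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cpdb_adminbounds ("
    "feature_id TEXT, admin_boundary_type TEXT, admin_boundary_id TEXT)"
)
sample_rows = [
    ("EXAMPLE_A", "commboard", "101"),
    ("EXAMPLE_A", "commboard", "101"),  # exact duplicate of the row above
    ("EXAMPLE_A", "council", "1"),
    ("EXAMPLE_B", "commboard", "102"),
]
conn.executemany("INSERT INTO cpdb_adminbounds VALUES (?, ?, ?)", sample_rows)

key = ["feature_id", "admin_boundary_type", "admin_boundary_id"]
duplicates = find_duplicates(conn, "cpdb_adminbounds", key)
print(duplicates)
```

A test built on this idea would pass only when the query returns no rows, which is the same contract a uniqueness test over a combination of columns would enforce.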

screenshots

Before: [screenshot, 2024-04-25 at 3:39:36 PM]

After: [screenshot, 2024-04-25 at 3:40:00 PM]