NYCPlanning / data-engineering

Primary repository for NYC DCP's Data Engineering team

new GRU QAQC report #812

Open damonmcc opened 2 weeks ago

damonmcc commented 2 weeks ago

Geographic Research would like to add a new QAQC report to the portfolio of existing GR QAQC reports.

This new report would compare Historic Footprint BIN vs PAD BIN. It is essentially a version of the existing BIN vs PAD BIN report, but using historical data.
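As a rough illustration of what such a comparison might involve, here is a minimal sketch that assumes the report lists BINs present in one dataset but missing from the other. The function name, the output keys, and the sample BINs are all hypothetical; the real report's logic and inputs may differ.

```python
def compare_bins(footprint_bins, pad_bins):
    """Return the BINs found in one input but not the other (hypothetical helper)."""
    footprint_set = set(footprint_bins)
    pad_set = set(pad_bins)
    return {
        # BINs that appear in the historic footprints but not in PAD
        "in_footprints_not_pad": sorted(footprint_set - pad_set),
        # BINs that appear in PAD but not in the historic footprints
        "in_pad_not_footprints": sorted(pad_set - footprint_set),
    }


# Made-up sample values, not real BINs from either dataset
historic_footprint_bins = ["1000001", "1000002", "1000003"]
pad_bins = ["1000002", "1000003", "1000004"]

result = compare_bins(historic_footprint_bins, pad_bins)
```

Set differences keep the sketch independent of how either dataset is stored; an actual implementation would more likely be a query against the loaded tables.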

Geographic Research would like to be able to run this new report after 24B is released on May 5 to clean up the data ahead of the 24C release.

This is the best place for DE to get historical footprint data (already in edm-recipes)

fvankrieken commented 4 days ago

This will need changes in a couple of places.

GRU QAQC runs in another repo: https://github.com/NYCPlanning/db-gru-qaqc. This could be moved at this point, because I've removed most of the automation around issues and GitHub Actions within the repo in favor of the page on the Streamlit app. But that doesn't need to happen now.

In that repo, we'll need

Then, in the QA app, we'll need this added as an option for GRU. I think this is largely abstracted, and there's just a CSV/dataframe that will need a row added for this test. That should in turn add a row to the dashboard table.

If this is a new source dataset, it should also be added to the table of source data on the page, showing its last date of archival and latest version.

fvankrieken commented 4 days ago

This test should be called historic-footprints-vs-pad
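Registering the test under that name in the QA app's CSV/dataframe might look roughly like the sketch below. It is only an illustration: the column names (`check_name`, `description`), the `add_check` helper, and the placeholder existing row are assumptions, not the app's actual schema or code.

```python
import csv
import io

# Hypothetical column names; the real registry's schema may differ.
FIELDS = ["check_name", "description"]


def add_check(registry_csv: str, check_name: str, description: str) -> str:
    """Return the registry CSV text with one row appended for a new check.

    If a row with the same check_name already exists, the text is returned unchanged.
    """
    rows = list(csv.DictReader(io.StringIO(registry_csv)))
    if any(row["check_name"] == check_name for row in rows):
        return registry_csv
    rows.append({"check_name": check_name, "description": description})

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Placeholder registry content, made up for the example
registry = "check_name,description\nexample-existing-check,Placeholder for an existing GRU check\n"

updated = add_check(
    registry,
    "historic-footprints-vs-pad",
    "Historic Footprint BIN vs PAD BIN",
)
```

The duplicate check means running the helper twice would not add a second row for the same test.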