Open Oysters1874 opened 2 years ago
Do you have a sense of which parts of this work should be done on the pipeline side vs. the app side?
Yeah, I can mark that as well. So far, I think all of the existing QAQC tables are uploaded to DO. For invalid records, mismatches, and version-to-version comparison, we can display them directly on the app side.
👍
The reports that you have outlined are very useful:
The following may not be so useful:
This is more general, but for COLP and other data products we like to check that all of the geospatial values are in sync. For example, does the first digit of the BIN, BBL, and CD match the boro code? Perhaps there is a way to incorporate this type of check into COLP QAQC, designed so that it is easy to replicate across data products.
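As a rough illustration of the boro-code check described above, here is a minimal Python sketch. The field names (`borocode`, `bin`, `bbl`, `cd`) and the sample rows are assumptions made up for the example, not the actual COLP schema. Taking the list of fields as a parameter is one way to make the check reusable across data products.

```python
def boro_mismatches(record, boro_field="borocode", id_fields=("bin", "bbl", "cd")):
    """Return the identifier fields whose first digit differs from the boro code.

    Field names are hypothetical; pass the real column names for each data product.
    """
    boro = str(record.get(boro_field) or "").strip()
    mismatched = []
    for field in id_fields:
        value = str(record.get(field) or "").strip()
        if not value:
            continue  # missing identifiers are skipped here, not flagged
        if value[0] != boro:
            mismatched.append(field)
    return mismatched


# Made-up sample rows: the second one has a BBL that starts with a different boro digit.
records = [
    {"borocode": "1", "bin": "1001234", "bbl": "1000010001", "cd": "101"},
    {"borocode": "3", "bin": "3012345", "bbl": "4012340056", "cd": "301"},
]

# Map row index -> list of out-of-sync fields, keeping only rows with a problem.
flagged = {}
for i, rec in enumerate(records):
    bad = boro_mismatches(rec)
    if bad:
        flagged[i] = bad
```

Note that a record with an empty boro code will have every populated identifier flagged, which may or may not be the desired behavior.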
I'm happy to meet to talk anything through if helpful. Looking forward to seeing this.
Proposed Work
Graphs -- App Side
Display the number of records by agency/usetype.
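The counts behind such a graph could be computed along these lines. This is only a sketch: the column names `agency` and `usetype` and the sample values are assumptions, and the app may well do this with a dataframe group-by instead.

```python
from collections import Counter


def count_by(records, *fields):
    """Count records per unique combination of the given fields."""
    return Counter(tuple(rec.get(f) for f in fields) for rec in records)


# Made-up rows for illustration only.
rows = [
    {"agency": "DCAS", "usetype": "OFFICE"},
    {"agency": "DCAS", "usetype": "OFFICE"},
    {"agency": "PARKS", "usetype": "PARK"},
]

by_agency_usetype = count_by(rows, "agency", "usetype")
by_agency = count_by(rows, "agency")
```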
Tasks:
Version-to-version comparison -- App Side (and maybe COLP Side)
We can display version-to-version changes in the number of records per use type. As the table already exists, we only need to work on the app side.
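For reference, the per-use-type comparison could look roughly like the sketch below. It assumes the counts for each version are already available as simple mappings from use type to record count. The function name and the sample numbers are hypothetical, and the existing QAQC table may already store some of these columns.

```python
def usetype_changes(prev_counts, curr_counts):
    """Compare record counts per use type between two versions."""
    changes = {}
    for usetype in sorted(set(prev_counts) | set(curr_counts)):
        prev = prev_counts.get(usetype, 0)
        curr = curr_counts.get(usetype, 0)
        diff = curr - prev
        # Percent change is undefined when the use type is new in this version.
        pct = None if prev == 0 else diff * 100 / prev
        changes[usetype] = {"prev": prev, "curr": curr, "diff": diff, "pct_change": pct}
    return changes


# Made-up counts for two versions.
prev = {"OFFICE": 100, "PARK": 50}
curr = {"OFFICE": 110, "PARK": 50, "GARAGE": 5}
changes = usetype_changes(prev, curr)
```

Use types that appear in only one version are kept in the output, so additions and removals show up rather than being silently dropped.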
We can follow the design of the CPDB page for this section.
Outlier Report -- App Side
Using the two existing QAQC tables, ipis_modified_hnums and ipis_modified_names, we can display the records whose house numbers or parcel names were modified, along with the relevant fields.
Geospatial Check -- Both COLP and App Side
Check whether all properties are within NYC borough boundaries.
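To make the idea concrete, here is a simplified sketch of a point-in-boundary check using plain ray casting. It assumes each property has `x`/`y` coordinates and each borough boundary is a single simple polygon. Both are assumptions for illustration: the real borough boundaries are multipolygons, so an actual implementation would more likely rely on a geospatial library or the database's spatial functions.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon (list of (x, y) vertices)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def outside_all_boundaries(records, boundaries):
    """Return the records that fall inside none of the given boundary polygons."""
    return [
        rec for rec in records
        if not any(point_in_polygon(rec["x"], rec["y"], poly) for poly in boundaries.values())
    ]


# Toy square "boroughs" and made-up points, for illustration only.
boundaries = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
properties = [
    {"id": 1, "x": 5, "y": 5},
    {"id": 2, "x": 15, "y": 5},
    {"id": 3, "x": 25, "y": 5},
]
outliers = outside_all_boundaries(properties, boundaries)
```

Points that fall exactly on a boundary edge are not handled specially in this sketch.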
Manual Corrections Check -- App Side
We can display graphs and a dataframe of the manual corrections applied and not applied, by field, as is already done for PLUTO.
Current QAQC tables:
- Identifying invalid data in IPIS:
- Version-to-version comparison for COLP review: