NYCPlanning / data-engineering

Primary repository for NYC DCP's Data Engineering team

Ar facilities automation #832

Closed alexrichey closed 1 month ago

alexrichey commented 1 month ago

This PR does a lot: basically everything required for me to have confidence pushing facilities to Socrata. It's probably easiest to review commit by commit.

Validation

Here's what you'll see, using template_db as an example:

['templatedb_points.shp.zip',
 'COLUMN_MISMATCH',
 "Invalid column(s) found in source data: {'geometry', 'wkb_geomet', "
 "'geometry_e'}."]
['templatedb_points.shp.zip',
 'COLUMN_MISMATCH',
 "Column(s) missing from source data: {'the_geom'}."]
['templatedb_points.shp.zip',
 'INVALID_DATA',
 'Column bbl contains 218 invalid record(s), for example: None']
['templatedb_polygons.shp.zip',
 'COLUMN_MISMATCH',
 "Invalid column(s) found in source data: {'geometry', 'wkb_geomet', "
 "'geometry_e'}."]
['templatedb_polygons.shp.zip',
 'COLUMN_MISMATCH',
 "Column(s) missing from source data: {'the_geom'}."]
['templatedb_polygons.shp.zip',
 'INVALID_DATA',
 'Column bbl contains 2372 invalid record(s), for example: None']
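
For reviewers who want a feel for the shape of these checks, here's a rough sketch of how errors like the ones above could be produced. It is not the code in this PR: `check_columns`, `check_values`, and `is_valid_bbl` are hypothetical names, the expected and actual columns are made-up sample data, and the rule that a valid bbl is a 10-digit string is an assumption for illustration.

```python
from pprint import pprint


def check_columns(file_name, expected, actual):
    """Compare the column names found in a source file against the expected ones."""
    errors = []
    extra = sorted(set(actual) - set(expected))
    missing = sorted(set(expected) - set(actual))
    if extra:
        errors.append([file_name, "COLUMN_MISMATCH",
                       f"Invalid column(s) found in source data: {extra}."])
    if missing:
        errors.append([file_name, "COLUMN_MISMATCH",
                       f"Column(s) missing from source data: {missing}."])
    return errors


def check_values(file_name, column, values, is_valid):
    """Report how many values in a column fail the given validity check."""
    invalid = [v for v in values if not is_valid(v)]
    if not invalid:
        return []
    return [[file_name, "INVALID_DATA",
             f"Column {column} contains {len(invalid)} invalid record(s), "
             f"for example: {invalid[0]}"]]


def is_valid_bbl(value):
    # Assumption for this sketch: a valid bbl is a 10-digit string.
    return isinstance(value, str) and len(value) == 10 and value.isdigit()


# Made-up sample data, only to exercise the two checks.
errors = check_columns(
    "templatedb_points.shp.zip",
    expected=["the_geom", "bbl"],
    actual=["geometry", "bbl"],
)
errors += check_values(
    "templatedb_points.shp.zip", "bbl", ["1000010001", None], is_valid_bbl
)
pprint(errors)
```

Each error is a `[file name, error type, message]` list, matching the layout of the output above. The sketch sorts the column names into a list so the message is deterministic, whereas the real output prints a set.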

damonmcc commented 1 month ago

awesome! left comments on a couple small things