NYCPlanning / data-engineering

Primary repository for NYC DCP's Data Engineering team

lots missing from GFT source data exports #815

Closed damonmcc closed 2 weeks ago

damonmcc commented 2 weeks ago

problem

For many datasets we:

  1. get points (raw)
  2. replace each point with a lot polygon if one intersects it (intermediate)
  3. buffer the lots and remaining points (buffered)

For these datasets we export the point, lot, and buffered geometries (e.g. CATs permits, NYC Historic Buildings) so that they can be seen in the GFT app.

GIS noticed that, although the buffers are correct, some source_xxx_lots tables are missing lots.

damonmcc commented 2 weeks ago

I suspect this is due to our use of WHERE ST_GEOMETRYTYPE(raw_geom) = 'ST_MultiPolygon' in tables like sources__nyc_historic_buildings_lots, which would exclude any lot stored as a single Polygon.

When inspecting with GIS, the lots that did make it through appeared to all be MultiPolygon.
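
A minimal sketch of the suspected failure mode, using plain Python dicts in GeoJSON style rather than PostGIS. The record ids and coordinates are made up for illustration; the only point is that an exact match on the MultiPolygon type would silently drop lots stored as a single Polygon.

```python
# Hypothetical rows after the lot-replacement step: a geometry may still be
# a Point (no lot intersected), or a lot stored as Polygon or MultiPolygon.
records = [
    {"id": 1, "geom": {"type": "Point", "coordinates": [0.0, 0.0]}},
    {"id": 2, "geom": {"type": "Polygon",
                       "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]}},
    {"id": 3, "geom": {"type": "MultiPolygon",
                       "coordinates": [[[[2, 2], [3, 2], [3, 3], [2, 2]]]]}},
]


def lots_by_exact_type(rows):
    """Mimic WHERE ST_GEOMETRYTYPE(raw_geom) = 'ST_MultiPolygon'."""
    return [r for r in rows if r["geom"]["type"] == "MultiPolygon"]


exported_ids = [r["id"] for r in lots_by_exact_type(records)]
print(exported_ids)  # record 2 is a lot, but it is not in the export
```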

Since all records in an FGDB layer must share the same geometry type, we should either

  1. cast all polygons to MultiPolygon before selecting by geometry type
  2. or add columns to represent these "stages" of geometries so we can select points and lots using column names rather than by geometry types
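
Both options can be sketched in plain Python, again with GeoJSON-style dicts standing in for the real tables. This is an illustration under assumptions, not the pipeline's code: `to_multi` imitates what PostGIS `ST_Multi` does for polygons, and the column names `point_geom` and `lot_geom` are hypothetical.

```python
def to_multi(geom):
    """Promote a Polygon to a one-part MultiPolygon (the idea behind ST_Multi)."""
    if geom["type"] == "Polygon":
        return {"type": "MultiPolygon", "coordinates": [geom["coordinates"]]}
    return geom


polygon = {"type": "Polygon", "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]}
multi = {"type": "MultiPolygon",
         "coordinates": [[[[2, 2], [3, 2], [3, 3], [2, 2]]]]}
point = {"type": "Point", "coordinates": [0.0, 0.0]}

# Option 1: cast first, then select by geometry type.
rows = [{"id": 1, "geom": point}, {"id": 2, "geom": polygon}, {"id": 3, "geom": multi}]
cast_rows = [{**r, "geom": to_multi(r["geom"])} for r in rows]
option1_ids = [r["id"] for r in cast_rows if r["geom"]["type"] == "MultiPolygon"]

# Option 2: keep one column per stage (hypothetical names) and select lots
# by column, so the geometry type never enters the filter. The lot column is
# still cast so that the exported layer holds a single geometry type.
staged_rows = [
    {"id": 1, "point_geom": point, "lot_geom": None},
    {"id": 2, "point_geom": point, "lot_geom": polygon},
    {"id": 3, "point_geom": point, "lot_geom": multi},
]
option2_lots = [
    {"id": r["id"], "geom": to_multi(r["lot_geom"])}
    for r in staged_rows
    if r["lot_geom"] is not None
]
option2_ids = [r["id"] for r in option2_lots]

print(option1_ids, option2_ids)
```

Either way the lot stored as a single Polygon is kept. Option 2 also avoids inferring the processing stage from the geometry type at all.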