GreenInfo-Network / nyc-crash-mapper-etl-script

Extract, Transform, and Load script for fetching new data from the NYC Open Data Portal's vehicle collision data and loading into the NYC Crash Mapper table on CARTO.

Past records not in any assembly district #39

Closed gregallensworth closed 1 year ago

gregallensworth commented 1 year ago

Discovered while updating polygon data in #38

The crashes_all_prod table has 2,071,732 crashes in it today, of which 205,560 (about 10%) have NULL for assembly.

That "feels" like a lot of crashes to fall outside every assembly district while (presumably) still falling within the five boroughs.

I should check into this.


gregallensworth commented 1 year ago

Initial overview

I exported a shapefile of the 205,560 crash points with this query: SELECT * FROM crashes_all_prod WHERE assembly IS NULL
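For anyone repeating that export, here is a minimal sketch of building the download link in Python instead of clicking through the CARTO UI. It assumes the table is reachable through CARTO's SQL API with its `q` and `format` parameters; the account name `example-user` and the helper `export_url` are hypothetical, since the real account is not named in this thread.

```python
from urllib.parse import urlencode

# Hypothetical account name; the real CARTO username is not given in this thread.
CARTO_USER = "example-user"


def export_url(sql, fmt="shp", user=CARTO_USER):
    """Build a CARTO SQL API URL that returns the result of `sql` in `fmt`.

    Assumes the account exposes the standard /api/v2/sql endpoint.
    """
    base = f"https://{user}.carto.com/api/v2/sql"
    # urlencode percent-escapes the query so it is safe to put in a URL.
    return f"{base}?{urlencode({'q': sql, 'format': fmt})}"


# The same query used for the export, requested as a shapefile.
url = export_url("SELECT * FROM crashes_all_prod WHERE assembly IS NULL")
print(url)
```

Passing `fmt="geojson"` to the same helper would give the equivalent link for a GeoJSON export. A private table would also need an API key, which is left out of this sketch.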

I also exported the assembly districts as GeoJSON. These are the old 2015 shapes, but that fits our intent: finding out why the past records have been coming up this way.

Initial impressions are that this is indeed an edge effect. But many of them seem as if they would not be solved with a simple buffer (even if CARTO would let that happen without timing out).
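To make the edge effect concrete, here is a rough sketch of the "simple buffer" idea done offline: test whether a point falls inside a district, and if not, fall back to the nearest district within a tolerance. This is not what the ETL script does. The function names, the toy squares, and the `max_dist` tolerance are all made up for illustration, and it treats each district as a single planar ring (no holes, no multipolygons, no projection).

```python
import math


def point_in_ring(pt, ring):
    """Ray-casting test: True if pt is inside the ring of (x, y) vertices."""
    x, y = pt
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        # Only edges that straddle the horizontal line through pt can cross the ray.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def dist_to_segment(pt, a, b):
    """Planar distance from pt to the segment a-b."""
    px, py = pt
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Clamp the projection so the closest point stays on the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def dist_to_ring(pt, ring):
    """Planar distance from pt to the nearest edge of the ring."""
    return min(dist_to_segment(pt, a, b) for a, b in zip(ring, ring[1:] + ring[:1]))


def assign_district(pt, districts, max_dist):
    """Containing district id, else the nearest one within max_dist, else None."""
    for district_id, ring in districts.items():
        if point_in_ring(pt, ring):
            return district_id
    nearest_id, nearest = min(
        ((d, dist_to_ring(pt, r)) for d, r in districts.items()),
        key=lambda pair: pair[1],
    )
    return nearest_id if nearest <= max_dist else None


# Toy data: two square "districts" with a gap between them standing in for a river.
districts = {
    "A": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "B": [(6, 0), (10, 0), (10, 4), (6, 4)],
}
print(assign_district((2, 2), districts, max_dist=0.5))    # inside A
print(assign_district((4.3, 2), districts, max_dist=0.5))  # just off A's edge
print(assign_district((5, 2), districts, max_dist=0.5))    # mid-gap, beyond the tolerance
```

The third point is the bridge case in miniature: it is equally far from both districts, so a small tolerance leaves it unassigned and a larger one would pick a side arbitrarily. With real longitude/latitude data the distances would be in degrees, so the points and polygons would need projecting before a tolerance like this means anything.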

Overview

image

Plain Ol' Edges Example

image

Bridges Example

image

Inconsistent Bridge Coverage

image

danrademacher commented 1 year ago

Further discussion: if we fixed this by grabbing the unclipped versions, it would open a whole can of worms with other layers:

None of the 7 offered polygons include bridges. For all 7 it’s similar:

We don't have budget for this right now. Let me check with the client and inform her about this before we close this out as wont-fix.

danrademacher commented 1 year ago

OK, just chatted with Christine and she agreed with leaving this as is.