insight-lane / crash-model

Build a crash prediction modeling application that leverages multiple data sources to generate a set of dynamic predictions we can use to identify potential trouble spots and direct timely safety interventions.
https://insightlane.org
MIT License

Filter crashes outside of city boundaries #174

Closed by j-t-t 4 years ago

j-t-t commented 5 years ago

For crash datasets provided at the county level, we'd like to exclude crashes that fall outside the city polygon.

First, write a city polygon to the city's data directory, e.g. data/boston/processed/maps, in GeoJSON format (EPSG:4326 projection). That way, we won't have to fetch the city polygon twice.
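A minimal sketch of writing and re-reading such a file, using only the standard library. The function names and the city_polygon.geojson file name are hypothetical and not taken from the repo. The ring is assumed to be a single exterior ring of (lon, lat) pairs, which is the coordinate order GeoJSON uses.

```python
import json
import os


def write_city_polygon(ring, path):
    """Write one exterior ring of (lon, lat) pairs as a GeoJSON FeatureCollection.

    Hypothetical helper for illustration; not part of the crash-model codebase.
    """
    ring = [list(pt) for pt in ring]
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # GeoJSON linear rings must be closed
    feature = {
        "type": "Feature",
        "properties": {},
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }
    directory = os.path.dirname(path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    with open(path, "w") as f:
        json.dump({"type": "FeatureCollection", "features": [feature]}, f)


def read_city_polygon(path):
    """Return the exterior ring of the first feature in the file."""
    with open(path) as f:
        collection = json.load(f)
    return collection["features"][0]["geometry"]["coordinates"][0]
```

Later steps could then call read_city_polygon on something like data/boston/processed/maps/city_polygon.geojson instead of querying for the polygon again.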

The city polygon is sometimes generated in src/data/osm_create_maps.py (once the buffer_polygon branch is merged into master), if the city needed buffering. I think we should get the polygon even if the city didn't need buffering, and if there's no city polygon in OpenStreetMap, fall back to a circular polygon of some radius around the given center point.
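One way the circular fallback could look, as a rough sketch rather than what osm_create_maps.py does. The function name, the parameters, and the choice of 64 vertices are assumptions. It treats the Earth as a sphere and uses the standard destination-point formula, so the result is approximate, and it does not handle circles that cross the antimeridian or a pole.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; spherical approximation


def circle_polygon(lat, lon, radius_m, n_points=64):
    """Approximate a circle around (lat, lon) as a closed ring of
    (lon, lat) pairs in EPSG:4326.

    Hypothetical helper for illustration only.
    """
    lat1 = math.radians(lat)
    lon1 = math.radians(lon)
    d = radius_m / EARTH_RADIUS_M  # angular distance in radians
    ring = []
    for i in range(n_points):
        bearing = 2 * math.pi * i / n_points  # 0 = north, clockwise
        lat2 = math.asin(
            math.sin(lat1) * math.cos(d)
            + math.cos(lat1) * math.sin(d) * math.cos(bearing)
        )
        lon2 = lon1 + math.atan2(
            math.sin(bearing) * math.sin(d) * math.cos(lat1),
            math.cos(d) - math.sin(lat1) * math.sin(lat2),
        )
        ring.append((math.degrees(lon2), math.degrees(lat2)))
    ring.append(ring[0])  # close the ring
    return ring
```

The returned ring has n_points + 1 entries because the first vertex is repeated at the end.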

Then, in util.read_records, add an optional polygon argument (or maybe a polygon filename), and only add records whose points are within the polygon (you'll need to reproject into EPSG:3857 to do this check).
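A sketch of the containment check on its own, not the actual read_records change. It assumes each record is a dict with "x" and "y" keys already in EPSG:3857, and that the polygon is a single exterior ring of (lon, lat) pairs in EPSG:4326. Both are assumptions about the data, and the function names are hypothetical. The ring is projected with the spherical Web Mercator formula and tested with ray casting. Holes and multipolygons are ignored, and a geometry library would probably be the better choice in the real code.

```python
import math

WEB_MERCATOR_RADIUS_M = 6378137.0  # sphere radius used by EPSG:3857


def to_3857(lon, lat):
    """Project a lon/lat pair (degrees, EPSG:4326) to Web Mercator meters."""
    x = WEB_MERCATOR_RADIUS_M * math.radians(lon)
    y = WEB_MERCATOR_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat) / 2)
    )
    return x, y


def point_in_ring(x, y, ring):
    """Ray-casting test: count edge crossings of a ray going in +x."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def filter_records(records, ring_4326):
    """Keep only records whose (x, y) point lies inside the polygon ring."""
    ring_3857 = [to_3857(lon, lat) for lon, lat in ring_4326]
    return [r for r in records if point_in_ring(r["x"], r["y"], ring_3857)]
```

Projecting the polygon once, rather than every record, keeps the per-record work to the containment test itself.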

Then, in join_segments_crash_concern, there are two calls to read_records: one with record_type 'crash' and one with record_type 'concern'. Modify the call with record_type 'crash' to take the polygon.

j-t-t commented 4 years ago

Ended up duplicating this issue here, and it is now fixed: https://github.com/Data4Democracy/crash-model/issues/274