insight-lane / crash-model

Build a crash prediction modeling application that leverages multiple data sources to generate a set of dynamic predictions we can use to identify potential trouble spots and direct timely safety interventions.
https://insightlane.org
MIT License

Crash standardization should conform to specified start & end years #202

Closed terryf82 closed 5 years ago

terryf82 commented 5 years ago

Presently standardize_crashes.py converts every crash in the raw input file and ignores the optional start_year and end_year settings in the city config.

Since a user specifying those is only interested in that period, it makes sense for the standardize script to exclude crashes outside of that range.
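A minimal sketch of the kind of filter being proposed, not the project's actual code. The function name, the `dateOccurred` field, and the ISO 8601 date strings are assumptions made for illustration. It also treats both bounds as inclusive calendar years, which is an assumption; the real script may draw the end boundary differently.

```python
from datetime import datetime


def filter_crashes_by_year(crashes, start_year=None, end_year=None,
                           date_key="dateOccurred"):
    """Return the crashes whose year lies within the optional bounds.

    `date_key` and the ISO 8601 date format are assumptions for this
    sketch. A bound of None means "no limit on that side".
    """
    kept = []
    for crash in crashes:
        year = datetime.fromisoformat(crash[date_key]).year
        if start_year is not None and year < start_year:
            continue  # before the requested period
        if end_year is not None and year > end_year:
            continue  # after the requested period
        kept.append(crash)
    return kept


crashes = [
    {"id": 1, "dateOccurred": "2014-12-31T23:59:00"},
    {"id": 2, "dateOccurred": "2015-01-01T00:00:00"},
    {"id": 3, "dateOccurred": "2017-06-15T12:00:00"},
    {"id": 4, "dateOccurred": "2018-01-01T00:00:00"},
]
in_range = filter_crashes_by_year(crashes, start_year=2015, end_year=2017)
```

With no bounds supplied, the function returns every crash, which matches the behaviour described for a config that omits both options.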

terryf82 commented 5 years ago

https://github.com/Data4Democracy/crash-model/pull/203

I realised while writing this that other sections of data_standardization should probably be brought into line with the strategy of passing the config file as a param, rather than selected config items.

I'm submitting this PR as is so that @alicefeng can start working on adding crashes to the viz, and not have to filter out those that are unwanted.

Once we have this sorted, I'll go back and improve upon the remainder of data_standardization.
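To illustrate the "pass the config as a param" strategy mentioned above, here is a rough sketch. The function names, the plain-dict config, and the `year` field are all hypothetical and are not taken from the repository. The point is only that the entry point receives the whole config and each helper reads the keys it needs, so new options do not change the function's signature.

```python
def year_bounds(config):
    """Read the optional year bounds from a config mapping (None if absent)."""
    return config.get("start_year"), config.get("end_year")


def standardize(raw_crashes, config):
    """Hypothetical entry point that takes the whole config, not single items."""
    start_year, end_year = year_bounds(config)
    result = []
    for crash in raw_crashes:
        year = int(crash["year"])  # assumed field, for illustration only
        if start_year is not None and year < start_year:
            continue
        if end_year is not None and year > end_year:
            continue
        result.append(crash)
    return result


rows = [{"year": 2014}, {"year": 2016}]
only_recent = standardize(rows, {"start_year": 2015})
```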

terryf82 commented 5 years ago

PR updated with changes to treat end_year as 01-01-end_year:

https://github.com/Data4Democracy/crash-model/pull/203

terryf82 commented 5 years ago

This has been merged into master now.

@alicefeng if you run the pipeline for any city with a start_year and end_year in its config, the crashes.json file should only include that period. Let me know if you need a hand, thanks.