insight-lane / crash-model

Build a crash prediction modeling application that leverages multiple data sources to generate a set of dynamic predictions we can use to identify potential trouble spots and direct timely safety interventions.
https://insightlane.org
MIT License

standardize_crashes date filtering #203

Closed terryf82 closed 5 years ago

terryf82 commented 5 years ago

- config file now passed entirely to the data_standardization function, in line with other functions
- standardize_crashes now filters out crashes outside of the specified range
- PEP8 formatting applied to modified files
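For readers unfamiliar with the change, here is a minimal sketch of the kind of year-range filtering described above. It is not the code from this PR: the function name `filter_crashes_by_year`, the config keys `start_year` / `end_year`, and the crash field `dateOccurred` are all assumptions made for illustration, and both ends of the range are treated as inclusive.

```python
from datetime import datetime


def filter_crashes_by_year(crashes, config):
    """Keep crashes whose year falls inside the configured range.

    Hypothetical helper: `config` is assumed to be a dict that may contain
    integer `start_year` and `end_year` keys, and each crash is assumed to
    be a dict with an ISO 8601 `dateOccurred` string.
    """
    start_year = config.get("start_year")
    end_year = config.get("end_year")

    kept = []
    for crash in crashes:
        crash_year = datetime.fromisoformat(crash["dateOccurred"]).year
        # Skip crashes before the start of the range, if one is set.
        if start_year is not None and crash_year < start_year:
            continue
        # Skip crashes after the end of the range, if one is set.
        if end_year is not None and crash_year > end_year:
            continue
        kept.append(crash)
    return kept
```

Passing the whole config dict rather than individual values means that adding a new setting later does not change the function signature.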

codecov[bot] commented 5 years ago

Codecov Report

Merging #203 into master will increase coverage by 0.02%. The diff coverage is 18.91%.

@@            Coverage Diff             @@
##           master     #203      +/-   ##
==========================================
+ Coverage   45.42%   45.45%   +0.02%     
==========================================
  Files          29       29              
  Lines        2897     2904       +7     
==========================================
+ Hits         1316     1320       +4     
- Misses       1581     1584       +3

terryf82 commented 5 years ago

I can see the difference. Moving to full start & end dates makes sense, but that'll require changes in multiple scripts, right?

As an interim solution, could I change:

..(end_year is not None and crash_year > end_year)

to something like

..(end_year is not None and crash_year > (end_year - 1))

to complete this issue and not have it be a blocker for getting the new refined crash data? Then we could create a separate issue to review start & end date usage throughout the pipeline.

Thanks.
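To make the effect of the proposed interim change concrete, the sketch below compares the two conditions quoted above. Only the boolean expressions come from the comment; the function names are hypothetical, and `crash_year` and `end_year` are assumed to be integers.

```python
def excluded_current(crash_year, end_year):
    # Condition as currently written: end_year itself is kept.
    return end_year is not None and crash_year > end_year


def excluded_interim(crash_year, end_year):
    # Proposed interim condition: for integer years this is the same as
    # crash_year >= end_year, so end_year itself is dropped.
    return end_year is not None and crash_year > (end_year - 1)


if __name__ == "__main__":
    for year in (2016, 2017, 2018):
        print(year, excluded_current(year, 2017), excluded_interim(year, 2017))
```

With `end_year = 2017`, the two conditions differ only for crashes in 2017 itself: the current check keeps them, while the interim check excludes them. In other words, the interim change turns `end_year` from an inclusive bound into an exclusive one.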