insight-lane / crash-model

Build a crash prediction modeling application that leverages multiple data sources to generate a set of dynamic predictions we can use to identify potential trouble spots and direct timely safety interventions.
https://insightlane.org
MIT License
112 stars 40 forks source link

adding latest as point based value aggregation #205

Closed christophercahill closed 5 years ago

christophercahill commented 5 years ago

From https://github.com/Data4Democracy/crash-model/issues/194

Files changed: initialize_city.py standards/points-schema.json data_standardization/standardize_point_data.py data/create_segments

Files soon to be changed: standards/points-instance.json data_standardization/tests/test_standardize_point_data.py

christophercahill commented 5 years ago

Hi, if the first few edits look alright I'll go onto testing. Let me know if there are any other files that would be good to test besides data_standardization/tests/test_standardize_point_data.py

christophercahill commented 5 years ago

Hi, I tested this with the Philly config file (should be uploaded on this branch). It works as in it runs and I responded to the first round of requested changes, but I need to do informal and formal testing (to come very soon!)

j-t-t commented 5 years ago

Logic looks good now for create_segments, although it would be simpler to just say latest_date = max(values_all_dates.keys())

I'd remove the parts of the config file that are commented out, it makes it easier to read. Would also still like to see a link to the site where you found the AADT file and crash files in the config file

christophercahill commented 5 years ago

Just pushed a few of those changes and a test

Data: Crashes: https://www.opendataphilly.org/dataset/vehicular-crash-data Volume: https://www.opendataphilly.org/dataset/dvrpc-traffic-counts

Note: volume has to be sorted manually to only include volume (see column "Type")

christophercahill commented 5 years ago

@j-t-t besides getting rid of those testing pycache dirs, do you see anything else here?

j-t-t commented 5 years ago

It seems reasonable! Once you fix the merge conflict we can see how the tests do

codecov[bot] commented 5 years ago

Codecov Report

Merging #205 into master will increase coverage by 0.13%. The diff coverage is 95.45%.

@@            Coverage Diff             @@
##           master     #205      +/-   ##
==========================================
+ Coverage   54.03%   54.16%   +0.13%     
==========================================
  Files          32       32              
  Lines        3172     3190      +18     
==========================================
+ Hits         1714     1728      +14     
- Misses       1458     1462       +4
j-t-t commented 5 years ago

Looks good! Only two things:

christophercahill commented 5 years ago

Great! I will do both of those -- one q, is the 'test' in the first bullet point referring to initialize_city and related test?

j-t-t commented 5 years ago

For my first bullet point: it looks like you are adding a new config file (src/config/config_philly.yml)? Since initialize_city was changed to include a timezone since you ran initialize_city to create the philly config file, your config file is now missing a required timezone. You can just add the line timezone: America/New_York to config_philly.yml

christophercahill commented 5 years ago

Great, those should be fixed now