Chicago / vision-zero-dashboard

Vision Zero Dashboard
MIT License

Distance Metrics in the data sets #20

Open aginensky opened 5 years ago

aginensky commented 5 years ago

I want to understand what distance metrics you have used in the scripts. I have found a couple of packages that turn latitude/longitude data into distances. If you have a preferred package, I can just use that. Secondly, once a distance package is installed, I want to do some sort of 'clustering' to determine how many unique accident sites there are. I've done nothing yet, but I'm wondering whether there are instances in which nominally distinct locations are actually 10 ft apart, which likely means they are the same site. It would also be interesting to look at accidents with some sort of grid search: for example, write code to count how many accidents are within 10/50/100/250 feet of a given accident. Thoughts? Suggestions?
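To make the "within 10/50/100/250 feet" idea concrete, here is a minimal sketch in plain Python. It assumes crash locations are available as (latitude, longitude) pairs in decimal degrees and uses the Haversine great-circle formula as the distance metric; the function names `haversine_ft` and `counts_within` and the sample coordinates are made up for illustration and are not taken from the repository's scripts.

```python
import math

# Mean Earth radius (6,371,008.8 m) converted to international feet.
EARTH_RADIUS_FT = 6_371_008.8 / 0.3048


def haversine_ft(lat1, lon1, lat2, lon2):
    """Great-circle distance in feet between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against tiny floating-point overshoot.
    return 2 * EARTH_RADIUS_FT * math.asin(min(1.0, math.sqrt(a)))


def counts_within(target, others, radii_ft=(10, 50, 100, 250)):
    """Count how many points in `others` lie within each radius of `target`."""
    dists = [haversine_ft(target[0], target[1], lat, lon) for lat, lon in others]
    return {r: sum(d <= r for d in dists) for r in radii_ft}


# Hypothetical coordinates for illustration only, not real crash records.
crash = (41.88, -87.63)
nearby = [(41.88001, -87.63), (41.8801, -87.63), (41.8805, -87.63), (41.881, -87.63)]
print(counts_within(crash, nearby))
```

This brute-force version compares the target against every other point, which is fine for one crash at a time; repeating it for every crash in a large data set would likely call for a spatial index instead.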

sas1336 commented 5 years ago

I think this is a great idea. I would also try to understand how we get the coordinates in the first place. Until recently (i.e., until 2017), IDOT geocoded the crash locations. I am not totally clear on this myself, but as I understand it, the geocoding is based on the addresses that reporting officers enter. So if there is a long block with only one address, are all the crashes that happen along that block plotted at a single location? I will do a deeper dive on this question myself too.
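One quick way to probe the geocoding question is to count how many crashes share exactly the same coordinate pair. The sketch below assumes the crash locations are available as (latitude, longitude) pairs; the function name `repeated_locations`, the rounding to six decimal places, and the sample points are illustrative assumptions, not anything taken from the IDOT data.

```python
from collections import Counter


def repeated_locations(points, min_count=2, decimals=6):
    """Return (location, count) pairs for coordinates that occur at least `min_count` times.

    Coordinates are rounded to `decimals` places so that trivially different
    float representations of the same geocoded point are counted together.
    """
    counts = Counter((round(lat, decimals), round(lon, decimals)) for lat, lon in points)
    return [(loc, n) for loc, n in counts.most_common() if n >= min_count]


# Hypothetical points for illustration only.
sample = [(41.88, -87.63), (41.88, -87.63), (41.88, -87.63), (41.89, -87.64)]
print(repeated_locations(sample))
```

A location with an unusually high count would be a candidate for a block where many crashes were geocoded to a single address, though it could equally be a genuinely dangerous spot.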

hneaz commented 5 years ago

I have worked with spatial data before in my current role. I used the geosphere package to compute Haversine distances and the RANN package to do k-nearest-neighbor searches in order to cluster locations to their nearest points. This is an interesting analysis to look into. I might be able to help since I have some experience.
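As a rough illustration of grouping nearby points into "sites", here is a small Python sketch. Note that it is not the geosphere/RANN workflow described above: it swaps in a brute-force pairwise comparison with a union-find structure (single-linkage grouping under a distance threshold). The names `group_sites` and `threshold_ft`, the 10 ft default, and the sample coordinates are assumptions chosen for illustration.

```python
import math

EARTH_RADIUS_FT = 6_371_008.8 / 0.3048  # mean Earth radius in feet


def haversine_ft(lat1, lon1, lat2, lon2):
    """Great-circle distance in feet between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_FT * math.asin(min(1.0, math.sqrt(a)))


def group_sites(points, threshold_ft=10.0):
    """Label each point with a site id; points within `threshold_ft` share an id.

    Grouping is transitive (single linkage): if A is near B and B is near C,
    all three get the same id even when A and C are farther apart than the threshold.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if haversine_ft(*points[i], *points[j]) <= threshold_ft:
                parent[find(i)] = find(j)

    # Renumber roots as consecutive site ids in order of first appearance.
    site_ids = {}
    return [site_ids.setdefault(find(i), len(site_ids)) for i in range(len(points))]


# Hypothetical coordinates for illustration only.
labels = group_sites([(41.88, -87.63), (41.88001, -87.63), (41.89, -87.63)])
print(labels, "unique sites:", len(set(labels)))
```

The pairwise loop is quadratic in the number of points, so for the full crash data set a nearest-neighbor index (which is what a kd-tree package provides) would probably be needed to find candidate pairs first.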

aginensky commented 5 years ago

I'm happy to collaborate with any or all.

On Tue, Nov 19, 2019 at 12:53 PM Hasib Neaz notifications@github.com wrote:

I have worked with spatial data before at my current role. I used geosphere package to use the Haversine distances as well as the RANN package to do K Nearest Neighbor in order to cluster locations to the nearest points. This is an interesting analysis to look into. I might be able to help since I have some experience.


hneaz commented 5 years ago

@aginensky I got the longitude and latitude data for the hospital list provided by @sas1336 using the SmartyStreets API. Another question to look into is the distance between the hospitals and the accident sites, as well as measuring response times, etc.

Here is the file: illinois_hospital_list.csv.zip
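For the hospital question, a first pass could find the nearest hospital to each crash by straight-line distance. The sketch below is only that: it assumes hospitals are available as (name, latitude, longitude) tuples, and the hospital names, coordinates, and the function name `nearest_hospital` are hypothetical. It does not read the attached CSV, because its column layout is not described here, and great-circle distance is only a crude proxy for response time, which depends on the road network.

```python
import math

EARTH_RADIUS_MI = 6_371_008.8 / 1609.344  # mean Earth radius in statute miles


def haversine_mi(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MI * math.asin(min(1.0, math.sqrt(a)))


def nearest_hospital(crash, hospitals):
    """Return (name, distance_in_miles) for the hospital closest to `crash`.

    `crash` is a (lat, lon) pair; `hospitals` is a non-empty list of
    (name, lat, lon) tuples.
    """
    return min(
        ((name, haversine_mi(crash[0], crash[1], lat, lon)) for name, lat, lon in hospitals),
        key=lambda pair: pair[1],
    )


# Hypothetical hospitals and crash location for illustration only.
hospitals = [("Hospital A", 41.90, -87.63), ("Hospital B", 41.70, -87.63)]
name, miles = nearest_hospital((41.88, -87.63), hospitals)
print(f"{name}: {miles:.2f} miles")
```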