jschulberg / DC-Transportation-Crashes

Analysis of transportation-related crashes (car, motorcycle, pedestrian, bike) in the Washington, D.C. area.
Geospatial Clustering: K-Means #6

Open jschulberg opened 1 year ago

jschulberg commented 1 year ago

Use a classical K-Means clustering algorithm to group the crashes by geospatial position. In this approach, we fix the number of clusters in advance and see how cleanly the car crashes separate into those clusters. The rationale is that DC has 8 pre-defined wards, so we can check whether a ‘K’ value of 8 gives the best separation in the data, or whether a different value of ‘K’ does. We will measure separability with an elbow plot of the within-cluster sum of squared distances (each crash to its cluster centroid) across a range of ‘K’ values.
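
A minimal sketch of the elbow computation described above, not the repo's actual implementation. It uses only the Python standard library and a plain Lloyd's-algorithm K-Means. The crash coordinates are synthetic, and the names `kmeans`, `sq_dist`, `centers`, and `wcss_by_k` are hypothetical. It also treats longitude/latitude as flat Euclidean coordinates, which is only an approximation over an area the size of DC.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two (x, y) points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm. Returns (centroids, labels, wcss)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k random data points
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:  # leave an empty cluster's centroid where it is
                centroids[j] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    # Within-cluster sum of squared distances (the elbow-plot metric).
    wcss = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, wcss


# Synthetic (longitude, latitude) points around three made-up hot spots.
# ASSUMPTION: real crash data would be loaded from the project's dataset.
rng = random.Random(42)
centers = [(-77.03, 38.90), (-76.98, 38.87), (-77.07, 38.94)]
points = [
    (rng.gauss(cx, 0.01), rng.gauss(cy, 0.01))
    for cx, cy in centers
    for _ in range(100)
]

# One K-Means run per candidate K; K = 8 corresponds to the ward count.
wcss_by_k = {k: kmeans(points, k)[2] for k in range(1, 11)}
for k, wcss in wcss_by_k.items():
    print(k, round(wcss, 4))
```

Plotting `wcss_by_k` against K gives the elbow plot: the K after which the curve flattens is the candidate to compare against 8. If scikit-learn is used instead, the same quantity is exposed as the `inertia_` attribute of a fitted `KMeans` model.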