nguyenuy / healthsee

An area Healthscore Calculator

Define how to rank a healthscore access for a zip code #1

Open nguyenuy opened 5 years ago

nguyenuy commented 5 years ago

This may require additional datasets, depending on how we want to weight a given area. Let's start with something simple and build from there.

Healthscore access criteria

  1. Distance to relevant health facilities
  2. Closures of nearby health facilities in the past year(s)
  3. Number of beds relative to population size/density. This may be harder to determine because we're limiting the area size and thus not including the entire general population

dewhite commented 5 years ago

I think distance to relevant health facilities can be calculated using this library: https://pypi.org/project/geopy/
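For illustration, here is a minimal sketch of the distance step on its own, assuming both points are already available as latitude/longitude pairs. It uses a plain haversine formula from the standard library rather than geopy, so it needs no network call; geopy.distance offers comparable great-circle and geodesic calculations. The function name `haversine_km` and the mean Earth radius constant are assumptions made for this example, not anything from the repo.

```python
import math

# Mean Earth radius in kilometres (assumed constant for this sketch).
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    # Haversine formula: a is the squared half-chord length on the unit sphere.
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

This treats the Earth as a sphere, which is probably accurate enough for ranking nearby facilities but is not an exact geodesic distance.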

However, computing the distance from a given address to all other addresses is probably a little slow. To address this issue, I think we should use the following approach:

  1. Precompute the long, lat of all the facilities (maybe these are already in the data?) using something like geopy
  2. Put these into a KD-Tree locally using scipy.spatial.KDTree

Then, to compute the nearest neighbors for a given address, you can do the following:

  1. For each new address entered, compute the x, y location using geopy
  2. Use the KDTree to get the top n closest neighbors, or get neighbors within some cutoff distance
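The lookup steps above could be sketched roughly as follows, assuming facility coordinates are already known and skipping the geocoding step entirely. One deliberate change from the plain long/lat tree described above: the points are converted to 3D Cartesian coordinates first, because Euclidean distance on raw degrees distorts real distances. The facility coordinates, the helper names (`to_xyz`, `nearest_facilities`, `facilities_within_km`), and the Earth radius constant are all hypothetical and only for illustration.

```python
import numpy as np
from scipy.spatial import KDTree

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius


def to_xyz(lat_lon_deg):
    """Convert an (N, 2) array of (lat, lon) degrees to (N, 3) points in km."""
    lat, lon = np.radians(np.asarray(lat_lon_deg, dtype=float)).T
    return EARTH_RADIUS_KM * np.column_stack(
        (np.cos(lat) * np.cos(lon), np.cos(lat) * np.sin(lon), np.sin(lat))
    )


def chord_to_arc_km(chord_km):
    """Convert straight-line (chord) distance to distance along the sphere."""
    ratio = np.clip(np.asarray(chord_km) / (2 * EARTH_RADIUS_KM), 0.0, 1.0)
    return 2 * EARTH_RADIUS_KM * np.arcsin(ratio)


# Hypothetical facility coordinates (lat, lon); real ones would come from the dataset.
facilities = np.array([[33.7490, -84.3880], [33.9519, -83.3576], [32.0809, -81.0912]])
tree = KDTree(to_xyz(facilities))  # built once, reused for every query


def nearest_facilities(lat, lon, n=2):
    """Return [(facility_index, distance_km), ...] for the n closest facilities."""
    chord, idx = tree.query(to_xyz([[lat, lon]])[0], k=n)
    arcs = chord_to_arc_km(np.atleast_1d(chord))
    return list(zip(np.atleast_1d(idx).tolist(), arcs.tolist()))


def facilities_within_km(lat, lon, cutoff_km):
    """Return sorted indices of facilities within cutoff_km along the sphere."""
    chord_cutoff = 2 * EARTH_RADIUS_KM * np.sin(cutoff_km / (2 * EARTH_RADIUS_KM))
    return sorted(tree.query_ball_point(to_xyz([[lat, lon]])[0], r=chord_cutoff))
```

A call such as `nearest_facilities(33.80, -84.40, n=2)` returns index/distance pairs ordered from closest to farthest. Only the address-to-coordinate step would still need geopy; the tree lookup itself is local.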

Let me know your thoughts, and if this solves the issue.

nguyenuy commented 5 years ago

Given the time constraints for the week, I think we will have to bypass obtaining the exact facility distance. This will skew our numbers. The geopy library has to retrieve the coordinates of an address using a network call, which is a bottleneck in itself.