UDST / pandana

Pandas Network Analysis by UrbanSim: fast accessibility metrics and shortest paths, using contraction hierarchies :world_map:
http://udst.github.io/pandana
GNU Affero General Public License v3.0
385 stars 84 forks source link

[Question] Find nearest POI to POI from another category #146

Open majdal opened 3 years ago

majdal commented 3 years ago

Hello - Thanks for the great work on this package!

I just spent a couple of hours digging through the code of pandana to find an answer to this question, but couldn't find any solutions. I'd appreciate any help!

The problem that I have is what follow: I have a number of buildings (node POIs), and a number of schools (node POIs). I want to find the nearest school to each building, using the network from OSM.

It's possible to accomplish this through a nearest neighbour look-ups using scipy's KDTree or scikit-learn's BallTree, but those provide euclidian or great circle distances, and what I want is the network distance. Is there a suggested way to go about this using pandana?

In my digging, I found the find_all_nearest_pois function, but that gives me the nearest POI for the network's nodes, not the nodes from a category that I specify.

Many thanks!

smmaurer commented 3 years ago

Hi @majdal,

My guess is that the brute force approach of calculating the distance from every building to every school -- and then selecting the closest one -- would actually work pretty well here.

Since the recent v0.5 update, point-to-point network distances are super fast to calculate. Section 2 of this notebook has some code examples: Pandana-demo.ipynb

It's on our wish list to expose direct functionality like you're describing, but right now I think it's buried deep inside the graph traversal algorithms.

Something here might be helpful too: shortest-paths.ipynb. This notebook uses NetworkX instead of Pandana to calculate distances, which guarantees you're getting exactly the shortest path but is much slower. The last cell has an example of selecting the closest destination for each origin.

One final suggestion -- if there are so many buildings (millions, say) that the pure brute-force approach is too slow, you could add a preprocessing step where you use euclidean distances to identify a handful of candidate schools for each building, and then use Pandana to figure out which has the shortest network distance.

Hope this helps, and best of luck!

smmaurer commented 3 years ago

Actually, it might be easier and/or faster to just calculate the closest school to each network node, and then match the buildings that you're interested in to the relevant nodes.

The function you linked to is actually in the underlying C++ code -- here's the corresponding Python function: http://udst.github.io/pandana/network.html#pandana.network.Network.nearest_pois

fscottfoti commented 3 years ago

But find_all_nearest_pois https://github.com/UDST/pandana/blob/dev/src/cyaccess.pyx#L101 gives you the distance to one of your categories (say schools) for all nodes in the network, so you have the entire surface of nearest POIs for the whole space of the network. So you map the buildings to nodes (using get_node_ids) and just do a loc on the DataFrame returned by find_all_nearest_pois right?

I don't think the method Sam mentions gives you any more precision since it will still be node-to-node paths. The advantage of Sam's method is you will get the paths not just which school it is and the distance. Like Sam says, I'd just run shortest path queries between buildings and schools if you need the paths.

On Tue, Sep 8, 2020 at 10:31 AM Sam Maurer notifications@github.com wrote:

Hi @majdal https://github.com/majdal,

My guess is that the brute force approach of calculating the distance from every building to every school -- and then selecting the closest one -- would actually work pretty well here.

Since the recent v0.5 update, point-to-point network distances are super fast to calculate. Section 2 of this notebook has some code examples: Pandana-demo.ipynb https://github.com/UDST/pandana/blob/dev/examples/Pandana-demo.ipynb

It's on our wish list to expose direct functionality like you're describing, but right now I think it's buried deep inside the graph traversal algorithms.

Something here might be helpful too: shortest-paths.ipynb https://github.com/smmaurer/cp255/blob/master/07-street-networks/shortest-paths.ipynb. This notebook uses NetworkX instead of Pandana to calculate distances, which guarantees you're getting exactly the shortest path but is much slower. The last cell has an example of selecting the closest destination for each origin.

One final suggestion -- if there are so many buildings (millions, say) that the pure brute-force approach is too slow, you could add a preprocessing step where you use euclidean distances to identify a handful of candidate schools for each building, and then use Pandana to figure out which has the shortest network distance.

Hope this helps, and best of luck!

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/UDST/pandana/issues/146#issuecomment-689028101, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAOHNICXJNSXNZWC2R2SKMTSEZS6HANCNFSM4Q7KPQRQ .

majdal commented 3 years ago

Thanks @smmaurer and @fscottfoti for your very quick replies!

I like Sam's suggestion, of finding the nearest school to each node, then matching the buildings with nodes. I'm still not sure what would be the best way of connecting the building to a node.

One (awkward) way of doing so would be to also run the nearest_pois but with category = buildings, then joining the two two dataframes by node, and removing duplicate buildings. But this seems overly cumbersome.

fillipefeitosa commented 3 years ago

@majdal

Hey friend, I got a similar problem and solved using OSMnx to get the "real distances" based on the street network and got the nearest using KDtree.

I had to find the nearest schools to a set of polygons centroids. To understand better check this notebook maps .

basically the algorithm will stretch the graph of the urban network.

nA = np.array(list(zip(polygon.geometry.x, polygon.geometry.y)) )
nB = np.array(list(zip(schools_preEscolar.geometry.x, schools_preEscolar.geometry.y)) )
btree = cKDTree(nB)
dist, idx = btree.query(nA, k=1)
polygon['place'] = idx
polygon['dist'] = dist
CWen001 commented 3 years ago

Hello. Is this feature on a development roadmap, please?