UDST / pandana

Pandas Network Analysis by UrbanSim: fast accessibility metrics and shortest paths, using contraction hierarchies :world_map:
http://udst.github.io/pandana
GNU Affero General Public License v3.0
385 stars 84 forks source link

incorrect shortest paths solutions when impendance values are particularly small or large #179

Open bonza2 opened 2 years ago

bonza2 commented 2 years ago

Description of the bug

I am encountering incorrect shortest path solutions when using the Network.shortest_path (and Network.shortest_paths) method on a network with impedance values of small or large (though still reasonable) orders of magnitude. I have included a very simple toy example below that captures the problem. Essentially, when the impedance values are scaled to a sufficiently small order of magnitude (by experiment seems to be ~1e-4) or large order of magnitude (by experiment seems to be ~1e7), the shortest path solution is no longer correct (and, incidentally, appears to revert to a solution that traverses the least number of nodes, rather than minimizing total impedance, though my example doesn't explore this aspect of the problem).

Environment

Paste the code that reproduces the issue here:

import numpy as np
import pandana as pdna
import pandas as pd

nodes = [1, 2, 3, 4, 5]
nodes = pd.DataFrame(
    np.random.uniform(size=(5, 2)), columns=["x", "y"], index=nodes
)

edges = pd.DataFrame(
    [
        [1, 2, 0.001],
        [2, 3, 0.001],
        [3, 4, 0.001],
        [1, 5, 0.002],
        [5, 4, 0.002],
    ],
    columns=["from_node", "to_node", "weight"]
)

net = pdna.Network(nodes["x"], nodes["y"], edges["from_node"], edges["to_node"], edges[["weight"]], twoway=False)

net.shortest_path(1, 4)

>>> array([1, 2, 3, 4], dtype=int64)

The above example produces the expected solution.

Now, use exactly the same nodes and edges, but scale the impedance values down by a factor of 10:

net = pdna.Network(nodes["x"], nodes["y"], edges["from_node"], edges["to_node"], edges[["weight"]]  / 10, twoway=False)

net.shortest_path(1, 4)

>>> array([1, 5, 4], dtype=int64)

In the second example, clearly the more expensive and incorrect route has been chosen. The same behavior is observed at the larger end of impedance values also: for example, using the above example, when the impedance values are 1e7 instead .0001, and 2e7 instead of .0002, the same behavior occurs. I'm at a loss as to why this is happening?

Thank you for your help!

stefancoe commented 1 year ago

I am experiencing the same issue. The returned distance between two nodes of interest is about a mile shorter than the Euclidian distance. NetworkX is able to return a distance greater than the Euclidean distance using the same network. Happy to post code and data if that helps.