Closed tomalrussell closed 3 years ago
Algorithm notes
Library notes
dijkstra_iterator
which could be used to build up source-to-targets shortest pathsAlso pandana
which uses a contraction hierarchy - can do multiple a-b routes, but not yet single-source shortest path trees, as far as I understand - see notes in https://github.com/UDST/pandana/issues/120
Script scripts/sources_sinks_routing/benchmark_shortest_path.py benchmarks igraph vs pandana for computing many-to-many shortest paths from ensemble S
to T
.
Igraph computes vertex paths directly. Pandana computes vertex paths which must be transformed into edge path. This can be done through igraph.Graph.get_eids
. This adds significant overhead - but faster than looking up edges in the original dataframe based on start and end nodes. Below are the timings for 3 different ensemble size (S
and T
having the same size). All average timings are in seconds and statistic taken over 10 repetitions (random S
and T
ensembles).
python benchmark_shortest_path --nreps=10 --reconstr --sizes 5 10 20
Size | Igraph (avg) | Igraph (std) | pandana (avg) | pandana (std) | reconstr (avg) | reconstr (std) |
---|---|---|---|---|---|---|
5 | 0.74 | 0.06 | 0.41 | 0.03 | 1.24 | 0.09 |
10 | 0.83 | 0.01 | 0.44 | 0.00 | 5.28 | 0.04 |
20 | 0.88 | 0.02 | 0.46 | 0.02 | 21.24 | 0.50 |
Transforming vertex paths into edge paths is too slow (column reconstr
).
Forgetting about shortest path format, here are some timings for larger ensemble sizes (all average times in seconds). Pandana compute vertex paths, igraph computes edge paths.
python benchmark_shortest_path --nreps=10 --sizes 50 300 600
Size | Igraph (avg) | Igraph (std) | pandana (avg) | pandana (std) |
---|---|---|---|---|
50 | 0.83 | 0.01 | 0.44 | 0.01 |
300 | 2.69 | 0.51 | 2.11 | 0.23 |
600 | 6.49 | 0.40 | 7.46 | 0.37 |
igraph seems to scale better than pandana with ensemble size.
Network definition
Define a spatial network graph
G = {V,E}
as a set of nodesV
and edgesE
where each edge connects two nodes.A node can be represented by a
Point
geometry. It can be identified by index in the list of nodes or anode_id
. It may have an associatedPolygon
that represents a site, building, or service area.An edge can be represented by a
LineString
geometry. It can be identified by index in the list of edges or anedge_id
. It hassource
andtarget
attributes referring to its start and end nodes. It may bedirected
(True/False), if flows on the graph should only be allowed to traverse the edge in one direction from source to target.Routing problem
Given two sets of points
S
andT
, find the least-cost route from eachs
to eacht
.Reasonable implementation idea: use single-source shortest paths (Dijkstra) for each
s
inS
, possibly bail early once eacht
has been reached (no need to visit all nodes).See also multi-level Dijkstra or contraction-hierarchy approaches as in OSRM.
An aside on routing costs
The cost of moving over an edge can be provided, whether from input data or computed from other edge attributes (e.g. for roads: length, condition, speed limit).
The cost of moving over a node from one edge to another may not be necessary for our purposes (e.g. turn restrictions or turn penalties, waiting times at stations). For reference, OSRM uses an edge-expanded representation to handle this.
Another approach would be to introduce a complete graph (from each in-edge to each out-edge) at the location of nodes where waiting times are applicable, e.g. for a 2-degree node b with some waiting time 3:
Where the "waiting cost" of 3 at node b gets represented by an edge of weight 3 between b' and b'' (even though b' and b'' are physically represented as in the same location as b)