UrbanAnalyst / dodgr

Distances on Directed Graphs in R
https://urbananalyst.github.io/dodgr/
128 stars 16 forks source link

dodgr_times() output different on contracted vs. non.contracted graph #214

Closed jucardwell closed 6 months ago

jucardwell commented 1 year ago

Hi,

I have run dodgr_times(shortest= FALSE) on the same graph in it's contracted and non-contracted form and have gotten slightly different output. Seems to me that it should be the same, regardless if the graph is contracted or not? Any ideas as to what might be causing the small differences?

mpadge commented 1 year ago

Times will generally differ by small amounts after graph contraction, because contracted versions include time penalties for turning across complex street junctions. These reflect things like waiting at traffic lights, and time penalties for turning across oncoming traffic. The uncontracted version do not include such penalties, so times will generally differ.

jucardwell commented 1 year ago

Thanks @mpadge. Just for clarification, I was under the impression this was only for the sc-based networks, not sf based. Is it true for both?

mpadge commented 1 year ago

Ah, yes, that should indeed NOT occur for sf-based networks. Here is an example of time-based routing where the blue line is the contracted version of the full graph / red line: image So the differences reflect fundamentally different routes taken along contracted versus non-contracted graphs. I suspect that these very occassional and minor differences simply reflect rounding errors in the routing algorithms, such that the edge times along the contracted graph are not numerically identical to the equivalent sums of times along the original graph, but am still looking further and will get back to you.

mpadge commented 1 year ago

further insight, based on https://www.openstreetmap.org/way/251912836, the contraction of which involves adding numerous individual sections comprising a long curve:

library (dodgr)
packageVersion ("dodgr")
#> [1] '0.3.0.1'
f <- "mannheim-sf.Rds"
if (!file.exists (f)) {
    net <- dodgr::dodgr_streetnet ("mannheim germany")
    saveRDS (net, "mannheim-sf.Rds")
} else {
    net <- readRDS (f)
}
ids <- net$osm_id [grep ("Am\\sAubuckel", net$name)]
graph <- weight_streetnet (net, wt_profile = "motorcar")
#> The following highway types are present in data yet lack corresponding weight_profile values: platform, proposed, busway, construction, corridor, raceway, bus_stop, NA, elevator, rest_area,

# This gets total length of all segments on "Am Aubuckel":
ids0 <- which (graph$way_id %in% ids)
t0 <- sum (graph$time [ids0])
edges0 <- graph$edge_id [ids0]

# This re-maps contracted graph back on to original, to demonstrate that
# lengths remain the same:
graphc <- dodgr_contract_graph (graph)
edge_map <- readRDS (fs::dir_ls (tempdir (), regexp = "edge\\_map"))
index0 <- which (graphc$edge_id %in% edges0)
index1 <- which (edge_map$edge_old %in% edges0)
ids1 <- c (graphc$edge_id [index0], edge_map$edge_old [index1])
ids1 <- sort (as.integer (ids1))
index1 <- match (ids1, graph$edge_id)
t1 <- sum (graph$time [index1])

print (t0); print (t1)
#> [1] 266.4411
#> [1] 266.4411

# This gets the lengths of equivalent contracted edges:
index1 <- which (edge_map$edge_old %in% edges0)
enew <- edge_map$edge_new [index1]
enew <- unique (c (enew, graphc$edge_id [which (graphc$edge_id %in% edges0)]))
index2 <- match (enew, graphc$edge_id)
t2 <- sum (graphc$time [index2])
print (t2)
#> [1] 267.6206

Created on 2023-08-08 with reprex v2.0.2

And the sum of distances along the curve is then longer that the original sum of distances. That curve also forms part of the route on the non-contracted graph, but not in the contracted version. The slightly longer distance would then explain that, focussing the question here on why exactly that distance is longer ... ?

mpadge commented 6 months ago

Finally solved this. The discrepancy happens because of these lines: https://github.com/UrbanAnalyst/dodgr/blob/3bcb1b6ecf65e7fd611909b2971c39d61722a5f6/R/weight-streetnet-times.R#L214-L221

In the initial graph construction, times are calculated based on maximum speed along each segment. For motorcar routing, maximum speeds vary greatly in different locations, so maxspeeds for any ways which do not specify them are are estimated from the OSM data as median maximum speeds of all ways in an entire network for each way type. This is then data dependent, and can produce sampling discrepancies in maximum speed estimations, which can very occasionally lead to effects like those observed here.

The effects can be removed by specifying a weighting profile, as described in this vignette. But the whole problem is so rare and minor that I think that documenting this effect would likely only add confusion to things. That said, I'm going to close now, but as always feel free to ask any more questions. Thanks for bearing with me for the long time it took me to get to this!