Closed mguzmann closed 2 years ago
The easiest way to do that is to use the `merge_directed_graph() function. That will make all edges undirected, and you'll get a symmetrical distance matrix.
Note, however, that the matrices generated from directed graphs never violate the triangle inequality. They are merely asymetric, so that d(A,B) will not necessarily equal d(B,A). But it will always be the case that d(A,B) + d(B,C) >= d(A,C), which is the triangle inequality. If potential violation of that is really your concern, you don't need to worry. But if asymmetric matrices are your actual concern, then just "merge" the directed graph.
Thanks for your answer. Maybe I'm doing something wrong, but it seems like my results are not respecting triangle inequality. This example works for me (to produce matrices which violate tri ineq):
library(tidyverse)
library(dodgr)
library(osmextract)
library(sf)
## Adapted from the fossil package
tri.ineq.show <- function (D)
{
mat <- as.matrix(D)
n <- dim(mat)[1]
ineq.idx <- c()
for (i in 1:(n - 2)) {
for (j in (i + 1):(n - 1)) {
for (k in (j + 1):n) {
sds <- c(mat[j, i], mat[k, i], mat[k, j])
lng <- max(sds)
if (lng > (sum(sds) - lng)) {
# cat(sprintf("Check triangle %d %d %d\n",i,j,k))
ineq.idx <- rbind(ineq.idx,c(i,j,k))
}
}
}
}
return(ineq.idx)
}
dat <- oe_read("./data-berlin/berlin-latest.osm.pbf")
graph <- weight_streetnet(dat, wt_profile = "foot")
set.seed(1234)
from <- sample (graph$from_id, size = 100)
d_dist <- dodgr_dists(graph
, from = from
, to = from
, shortest = TRUE)
fossil::tri.ineq(d_dist[1:30,1:30])
fossil::tri.ineq(d_dist[1:10,1:10])
tri.ineq.show(d_dist[1:30,1:30])
d_dist[c(1,15,28), c(1,15,28)]
The symmetric part is less important to me, I could just copy the upper or lower triangle, but these triangle inequalities really wreck havoc in my models.
For the Berlin data:
wget http://download.geofabrik.de/europe/germany/berlin-latest.osm.pbf
Yep, i can confirm that you're right here. There are several issues here, starting with dodgr
generating inaccurate distance calculations to begin with, which the following code fixes by modifying all distances to the most accurate versions using an sc
-format network and using geodesic calculations. But that then reveals (with a simplified triangle inequality checker) that the inequality is indeed violated.
library (dodgr)
x <- readRDS ("/<path>/<to>/berlin-sc.Rds")
graph <- weight_streetnet (x, wt_profile = "foot")
#> Loading required namespace: geodist
#> Loading required namespace: dplyr
# manually change all edge distances to geodesic
xy0 <- graph [, c (".vx0_x", ".vx0_y")]
xy1 <- graph [, c (".vx1_x", ".vx1_y")]
wt <- graph$d_weighted / graph$d
graph$d <- geodist::geodist (xy0, xy1, paired = TRUE, measure = "geodesic")
graph$d_weighted <- graph$d * wt
set.seed(1234)
from <- sample (graph$.vx0, size = 100)
d_dist <- dodgr_dists(graph
, from = from
, to = from
, shortest = TRUE)
tri.ineq.show <- function (D)
{
mat <- as.matrix(D)
n <- dim(mat)[1]
ineq.idx <- c()
for (i in 1:(n - 2)) {
for (j in (i + 1):(n - 1)) {
for (k in (j + 1):n) {
long <- mat [i, j] + mat [j, k]
short <- mat [i, k]
if (is.na (short) | is.na (long)) {
next
}
if ((short - long) > 1e-6) {
ineq.idx <- rbind (ineq.idx, c (i, j, k))
}
}
}
}
return(ineq.idx)
}
ineq <- tri.ineq.show (d_dist)
nrow (ineq)
#> [1] 1035
i <- ineq [1, 1]
j <- ineq [1, 2]
k <- ineq [1, 3]
c (d_dist [i, j], d_dist [j, k], d_dist [i, j] + d_dist [j, k], d_dist [i, k])
#> [1] 4329.437 24118.779 28448.216 28514.949
Created on 2022-05-02 by the reprex package (v2.0.1)
I'll fix up the internal distance calculations first, then have a look at why the inequality is violated. Thanks for uncovering this :rocket: :heavy_check_mark:
Finally, note that one further issue with your approach is that the osmextract
package relies on GDAL via sf
to read the data, and that actually strips all the OSM information necessary to construct a proper dodgr network. So you'll never get accurate distances via osmextract
. What you can do is use osmium
to convert the .pbf
to .osm
/.xml
format, and then use osmdata
to read that in.
Finally getting back to this issue. I realised in the meantime that this behaviour is actually perfectly normal and expected. dodgr
implements weighed routing, yet returns the absolute, unweighted distances. The weighted distances A -> B plus B -> C will always be >= A -> C, but the absolute final distances may well be less. The weighting profiles reflect preferences for different kinds of ways. The best weighted path connecting A -> C may not necessarily be the shortest overall, and routing through some other mid-point B may indeed give shorter overall journeys. The only thing that can be stated with certainty is that those journeys will have longer weighted distances.
This suggests that a solution in cases where triangle inequalities should hold is to use something like wt_profile = 1
, and so not to weight each edge. But note even then that if using the full-detail weighting on SC-format networks, you still get the effects of things like waiting times at traffic lights, or directional turning penalties. So even then while that will mostly produce values of d_weighted
that are equal to equvalent direct distances, d
, that will not always be so, and you'll still end up with some d_weighted > d
. Those will generally reflect compound junctions with complicated turning rules, and then you may still see the trinagle inequality violated even when routing on an unweeighrted network.
This is a good thing, and reflects the routing engine doing what is expected. Realistic routing may always run the risk of violating tinangle inequalities. Even if you attempt to reduce as much as possible to some kind of "pure" geometric network (for example, wt_profile = 1
and turn_penalty = FALSE
), you may still violate geometric constrainsts because of unknown prohibitions within the network such as one-way rules.
So all is good, and i'll close this now, but feel fre to add additional comments @mguzmann. Thanks for prompting these important insights.
I am working with the Berlin pbf file: http://download.geofabrik.de/europe/germany/berlin-latest.osm.pbf If I calculate distances between random nodes in the graph, the resulting matrix is not symmetric. If I force that matrix to be symmetric, it violates triangle inequality.
I guess this is because of some routing decisions which make going from A to B different than going from B to A. It would be very useful to be able to ignore directionality and get a symmetric matrix that respects triangle inequalities. Is there a work around?