Closed chrijo closed 6 years ago
Thanks for opening this issue - it has revealed quite a few minor bugs in dodgr
code. None that yet related directly to your query, but I'll have to iron them out first before addressing it directly. The distances themselves should nevertheless be accurate, although the were inaccurate in previous versions, so maybe make sure you've got the latest dev
version installed (devtools::install_github("atfutures/dodgr")
)? But note at the moment that doing that will likely fail to generate distances for several of these - a problem which I hope to fix asap.
Initial comments on your query prior to that:
dodgr
, and could be tweaked to closely agree with the google API, but that's not really the point of this package.Dear Mark, thanks for your fast and accurate reply. I installed the dev-version, however the output of dodgr_dists() now is NA. I guess it has to do with the bugs you've mentioned. So I'm looking foward for a improved version of your package! Please let me know when it's fixed. Btw do you know the python module OSMnx which seems to have a similar approach to routing on osm networks and it has a nice feature of network simplification by reducing the number of redunandant nodes. Thank you for your efforts!
Thanks @chrijo, and thanks, yes, i'm aware of OSMnx. The diff with dodgr
is that it really is intended for massive routing, so the from
and to
points can be long vectors, and it will return a full matrix of distances. When I solve these bugs, I'll paste modified version of your code above here to show what I mean. dodgr
also does network simplification via dodgr_contract_graph()
, and the routing is conducted on the simplified/contracted graph to speed things up. dodgr
is also much faster: it can do like 1,000-times-1,000 points (so 1 million calculations) in a few minutes (depending on network size).
Okay, here we go with a massive response. I've been mostly using dodgr
for enormous routing tasks for which i've been able to ignore the NA
values, yet did mean to examine this further. This issue has given the ideal motivation to do that, so thanks! The following code should now work:
library (dodgr) # latest version
library(ggmap)
library(magrittr)
mainstation <- geocode("Hauptbahnhof, Essen, Germany")
mall <- geocode("Limbeckerplatz 1, Essen, Germany")
address1 <- geocode("Brigittastrasse 25, Essen, Germany")
address2 <- geocode("Kronprinzenstrasse 6, Essen, Germany")
address3 <- geocode("Engelsbecke 24, Essen, Germany")
address4 <- geocode("Holunderweg 10, Essen, Germany")
xy <- rbind (mainstation, mall, address1, address2,
address3, address4) %>%
data.frame ()
essen <- dodgr_streetnet (pts = xy, expand = 0.2, quiet = FALSE)
graph <- weight_streetnet (essen, wt_profile = "foot")
dodgr_dists (graph, from = xy, to = xy)
All distances are there except for to and from the mall. The reason for this can be seen by zooming in on a smaller section:
library (sf)
# get street network just from a very small area
xybb <- rbind (c (7.00421, 51.457),
c (7.00737, 51.4585)) %>%
data.frame () %>%
dplyr::rename (lon = X1, lat = X2)
essen_zoom <- dodgr_streetnet (pts = xybb, expand = 0, quiet = FALSE)
graph <- weight_streetnet (essen_zoom, wt_profile = "foot")
graph <- graph [which (graph$component == 1), ]
Then use just 2 routing points: the one inside the mall, and nearest street junction just outside, which has the following OSM ID:
nd <- "1590256404"
verts <- dodgr_vertices (graph)
i <- which (verts$id == nd)
xy2 <- rbind (c (7.005994, 51.45774), # the mall
as.numeric (verts [i, 2:3])) %>%
data.frame () %>%
dplyr::rename (lon = X1, lat = X2)
This can be viewed with mapview
to examine where the routing points are:
# convert pts to sf:
xy2sf <- function (xy)
{
res <- xy %>%
dplyr::select (lon, lat) %>%
as.matrix () %>%
split (., seq (nrow (.))) %>%
lapply (sf::st_point) %>%
sf::st_sfc () %>%
sf::st_sf ()
sf::st_crs (res) <- 4326
return (res)
}
library (mapview)
library (sf)
g_sf <- dodgr_to_sf (graph)
verts <- dodgr_vertices (graph)
xy_match <- verts [match_pts_to_graph (verts, xy2), c ("x", "y")] %>%
dplyr::rename (lon = x, lat = y) %>%
data.frame ()
m <- mapview (essen_zoom) %>%
addFeatures (xy2sf (xy2), color = "red", radius = 4) %>%
addFeatures (xy2sf (xy_match), color = "green", radius = 4)
print (m)
Hovering over those points reveals that they are on the OSM elements with IDs of #200178285 (the rotunda) and 200178377 (the internal passage). These do not, however, meet, as can be seen by filtering the above code to plot only those elements:
indx <- which (essen_zoom$osm_id %in% c ("200178285", "200178377"))
m <- mapview (essen_zoom [indx, ]) %>%
addFeatures (xy2sf (xy2), color = "red", radius = 4) %>%
addFeatures (xy2sf (xy_match), color = "green", radius = 4)
print (m)
Inspecting those ways on OSM reveals that the first one is UG (level = -1). The routing point has been matched there simply because it's the closest (x, y)
, but it can't actually be reached. (The other relevant ways are rotunda, more rotunda, the internal way with the routing point (on EG), the internal way on 1OG, and one on UG.)
The solution in this case is simply to filter out the ways that are not on the ground level. This is a bit fiddly because this seems to work:
levs <- paste0 (essen$level) # sf data cols are factors
indx <- which (!levs %in% c ("-1", "1"))
graph <- weight_streetnet (essen [indx, ], wt_profile = "foot")
d <- dodgr_dists (graph, from = xy, to = xy)
while this more generic approach does not:
indx <- which (!(is.na (levs) | grepl ("0", levs)))
graph <- weight_streetnet (essen [indx, ], wt_profile = "foot")
d <- dodgr_dists (graph, from = xy, to = xy)
The exact kind of level-filtering will depend on local data. (And if you run into any problems, you might need to re-generate the street network after having loaded sf
.) Moreover, note that this neither can nor should be done automatically, because it would then automatically disable any routing through any level != 0
. It is neverthelss obviously very important to highlight this, which I will do by extending the
examples of dodgr_dists
.
Finally, compare these to the google
ones
library (ggmap)
dg <- d
dg [1, 2] <- dg [2, 1] <-
mapdist(from = c("Hauptbahnhof, Essen, Germany"),
to = c("Limbeckerplatz 1, Essen, Germany"),
mode = "walking")$m
dg [1, 3] <- dg [3, 1] <-
mapdist(from = c("Hauptbahnhof, Essen, Germany"),
to = c("Brigittastrasse 25, Essen, Germany"),
mode = "walking")$m
dg [1, 4] <- dg [4, 1] <-
mapdist(from = c("Hauptbahnhof, Essen, Germany"),
to = c("Kronprinzenstrasse 6, Essen, Germany"),
mode = "walking")$m
dg [1, 5] <- dg [5, 1] <-
mapdist(from = c("Hauptbahnhof, Essen, Germany"),
to = c("Engelsbecke 24, Essen, Germany"),
mode = "walking")$m
dg [1, 6] <- dg [6, 1] <-
mapdist(from = c("Hauptbahnhof, Essen, Germany"),
to = c("Holunderweg 10, Essen, Germany"),
mode = "walking")$m
dg [2, 3] <- dg [3, 2] <-
mapdist(from = c("Limbeckerplatz 1, Essen, Germany"),
to = c("Brigittastrasse 25, Essen, Germany"),
mode = "walking")$m
dg [2, 4] <- dg [4, 2] <-
mapdist(from = c("Limbeckerplatz 1, Essen, Germany"),
to = c("Kronprinzenstrasse 6, Essen, Germany"),
mode = "walking")$m
dg [2, 5] <- dg [5, 2] <-
mapdist(from = c("Limbeckerplatz 1, Essen, Germany"),
to = c("Engelsbecke 24, Essen, Germany"),
mode = "walking")$m
dg [2, 6] <- dg [6, 2] <-
mapdist(from = c("Limbeckerplatz 1, Essen, Germany"),
to = c("Holunderweg 10, Essen, Germany"),
mode = "walking")$m
dg [3, 4] <- dg [4, 3] <-
mapdist(from = c("Brigittastrasse 25, Essen, Germany"),
to = c("Kronprinzenstrasse 6, Essen, Germany"),
mode = "walking")$m
dg [3, 5] <- dg [5, 3] <-
mapdist(from = c("Brigittastrasse 25, Essen, Germany"),
to = c("Engelsbecke 24, Essen, Germany"),
mode = "walking")$m
dg [3, 6] <- dg [6, 3] <-
mapdist(from = c("Brigittastrasse 25, Essen, Germany"),
to = c("Holunderweg 10, Essen, Germany"),
mode = "walking")$m
dg [4, 5] <- dg [5, 4] <-
mapdist(from = c("Kronprinzenstrasse 6, Essen, Germany"),
to = c("Engelsbecke 24, Essen, Germany"),
mode = "walking")$m
dg [4, 6] <- dg [6, 4] <-
mapdist(from = c("Kronprinzenstrasse 6, Essen, Germany"),
to = c("Holunderweg 10, Essen, Germany"),
mode = "walking")$m
dg [5, 6] <- dg [6, 5] <-
mapdist(from = c("Engelsbecke 24, Essen, Germany"),
to = c("Holunderweg 10, Essen, Germany"),
mode = "walking")$m
dg <- dg / 1000
plot (as.vector (d), as.vector (dg),
xlab = "dodgr", ylab = "google")
lines (0:10, 0:10, lty = 2)
The two are almost perfectly correlated, yet dodgr
gives generally much shorter distances. I've not examined what the google
router is doing there, but since most of the central area of Essen is pedestrianised, shorter distances can generally assume to be simply better here, I'd say. (And coincidentally, I actually live in Muenster, so know the whole area we're talking about here quite well, which made this a bit more fun than it would have been.)
This also demonstrates what I mean in the previous comment about dodgr
and large routing tasks - constructing that matrix from the google API is a pain in the butt. I'll leave this issue open until the example code has been updated.
Dear Mark, thank you very much for your detailed reply. It looks like I really have to get into osm to understand the pros and cons of the data. The differences in distances of Google API and dodgr is intressting. I didn't looked into your source code or Google API, however when matching of coordinates and neareast node is done, do you add this distance to the calculated path-distance?
nope, that's not added, but is generally only a couple of metres. (Adding would be somewhat dependent on the context in which dodgr_dists
was called, so would be slightly non-trivial, and not really advantageous. The large discrepancy between dodgr
and google
makes this kind of difference seem even less important.)
A short summary for you:
method | pro | con |
---|---|---|
will always give a result | total lack of flexibility, including no ability to route within, for example, different levels of a shopping mall | |
dodgr | may fail if input data not adequately prepared | much more flexible |
Concerning distances: dodgrdist() calculates about 637m from Kronprinzenstraße 6 to Brigittastraße 25 (4 -> 3), however linear distance is about 900 m.
library(dodgr)
library(ggmap)
address1 <- geocode("Brigittastrasse 25, Essen, Germany")
address2 <- geocode("Kronprinzenstrasse 6, Essen, Germany")
xy <- rbind (address1, address2) %>% data.frame ()
essen <- dodgr_streetnet (pts = xy, expand = 0.2, quiet = FALSE)
graph <- weight_streetnet (essen, wt_profile = "foot")
dodgr_dists (graph, from = xy, to = xy)
That gives abut 530m one way, 560m the other, and
haversine <- function (xy1, xy2)
{
xd <- (xy2 [1] - xy1 [1]) * pi / 180
yd <- (xy2 [2] - xy1 [2]) * pi / 180
d <- sin (yd / 2) * sin (yd / 2) +
cos (xy2 [2] * pi / 180) *
cos (xy1 [2] * pi / 180) * sin (xd / 2) * sin (xd / 2)
as.numeric (2 * 3671 * asin (sqrt (d)))
}
haversine (address1, address2)
gives a straight line dist of 410m. Everything seems okay.
Thanks! I tried to reproduce your example, but I got totally different results. However, I upgrade R to the latest version and install the latest dev-version of dodgr. Finally I got the same results . However, if I calculate the linear distance with library geospehere or sp it is about 714 m.
library(geosphere)
distm(x = xy[1,],y = xy[2,],distHaversine)
library(sp)
spDistsN1(as.matrix(xy),as.matrix(xy[2,]), longlat = T)
okay, now I really owe you a debt of gratitude for persisting with this - the previous version had the Earth's radius mistakenly entered as 3671
(and the above code was copied straight from there!), when it should have been 6371
- one small typo with huge implications! I've now updated it to the (overly-)precise value used in geosphere
, with the following results:
```{r}
library(dodgr)
library(ggmap)
library(magrittr)
address1 <- geocode("Brigittastrasse 25, Essen, Germany")
address2 <- geocode("Kronprinzenstrasse 6, Essen, Germany")
xy <- rbind (address1, address2) %>% data.frame ()
essen <- dodgr_streetnet (pts = xy, expand = 0.2, quiet = FALSE)
graph <- weight_streetnet (essen, wt_profile = "foot")
dodgr_dists (graph, from = xy, to = xy)
That now gives around 930m in one direction; 980 in the other, compared with
geosphere::distm (xy, fun = distHaversine)
giving 714m. The previous differences between dodgr
and google
distances were also obviously influenced by this typo. Updated version:
And the dodgr
distances are now slightly longer than the google ones (by around 10%), which is the kind of result that should arise if the pedestrian routing model for dodgr
actually chooses nicer but slightly longer routes. google really does just do a pretty dumb default route in most cases, as can be seen by the fact that it gives exactly the same result regardless of which direction you travel: Both of the following give 863m:
mapdist(from = c("Brigittastrasse 25, Essen, Germany"),
to = c("Kronprinzenstrasse 6, Essen, Germany"),
mode = "walking")$m
mapdist(from = c("Kronprinzenstrasse 6, Essen, Germany"),
to = c("Brigittastrasse 25, Essen, Germany"),
mode = "walking")$m
The (now finally correct!) dodgr
equivalents are the aforementioned 930 and 980m, because direction should and does matter. This in turn means that dodgr
should give generally slightly higher values than google, which is what the graph reveals.
See the above commit for my expression of gratitude to you. I'll have to rush out a new version asap to fix this!
@RobinLovelace check this: dodgr_dists
has been wrong all along coz I typed "3671" instead of "6371" for the Earth's radius! This won't have affected any of our work, and has no effect on dodgr_flows
, but has to chalk up as one of the most consequential typos of my coding career. Discovered thanks to @chrijo
Yeah, It's been a bit awkward for me to insist so much ;) Many thanks for developing this very help full package!!!
It was only your insistence and persistence that revealed my stupid mistake, so this is definitely a case where it's been supremely helpful. Thanks!
Great example of the importance of open and transparent development workflow. I'd just like to echo @chrijo's gratitude for flagging it and to Mark for acting on it!
To reiterate: this whole issue has been "supremely helpful"!
Hallo,
I am not sure what the exact unit (kilometre, miles?) of the output of dodgr_dists() is and if it's working correctly. I tried to verify the results from dodgr_dists() with googlemaps and there are hugh differences I can't explain. Please, try the following code to reproduce my Problem.