Closed mpadge closed 2 years ago
I think this would be useful for sure, great idea.
Another great use case: Take OD data and say you want to examine bicycle flows between all pairs. For each pair, this function would enable you to calculate the average length between all points in O and all in D that actually travels along dedicated bicycle paths (or whatever). So your OD flows would be able to be directly related to infrastructure connecting each pair.
Steps to implement this:
DGraphDualEdge
class which inherits from current DGraphEdge
, but has an additional <bool>
flag vector
https://github.com/ATFutures/dodgr/blob/f59b049b23f0919f88ad0a466057dee991db4cbc/src/dgraph.h#L23-L28DGraph::addNewEdge
to addNewDualEdge
which also includes the flag
vectorPathFinder::scan_edges
as scan_dual_edges
, and include vectors of <bool> flag
and <double> d_flagged
.run_sp
, including OneDist
-> OneDistDual
, and rcpp_get_sp_dists
as rcpp_get_sp_dists_dual
.rcpp_get_sp_dists_dual
to R and write a wrapper for that function interface.Definitely sounds useful, only thought is what if you wanted more than a binary breakdown, for example, % of the journey on a 20 mph road, 30 mph road, 40 mph road, etc
Yeah, @mem48, I had a think about that too - being able to submit a single factor column, and getting a breakdown for all specified factors. But the mechanics have to be propagated to the innermost depths of the underlying C++ Dijkstra algorithms, so even just doing it for binary is pretty non-trivial. Recorded here for posterity nonetheless, so thanks!
This is now almost fully implemented in a branch (#166). That will be merged when the TODO list has been completed - Done! Leaving vignette should be about as the only major remaining task.
Edit:
to
and directly calculate proportions from each from
point along all paths out to specified thresholdproportions_only
argument to return individual proportions for each from
value in that case.@mem48 @Robinlovelace This function now works. It's called dodgr_dists_categorical
, with a few types of interface implemented. The input graphs need to have an edge_type
column with some number of discrete values (integers, characters, whatever). Distances are then aggregated along each of these categories, and final results given for each component. There's also a dlimit
parameter which aggregates distances along all categories from a defined set of "from"
points. The following code demonstrates that:
library (dodgr)
packageVersion ("dodgr")
#> [1] '0.2.10.11'
f <- "/<path>/<to>/leeds-sc.Rds"
graph <- readRDS (f) %>%
weight_streetnet (wt_profile = "bicycle") %>%
dodgr_contract_graph ()
#> Loading required namespace: geodist
#> Loading required namespace: dplyr
# Some processing of "highways" into a restricted set of `edge_type` categories:
graph$edge_type <- graph$highway
graph$edge_type [grep ("^(primary|trunk)", graph$edge_type)] <- "primary"
graph$edge_type [grep ("^secondary", graph$edge_type)] <- "secondary"
graph$edge_type [grep ("^tertiary", graph$edge_type)] <- "tertiary"
graph$edge_type [graph$edge_type == "path"] <- "pedestrian"
graph$edge_type [graph$edge_type == "living_street"] <- "residential"
not_these_types <- c ("service", "footway", "unclassified", "track",
"steps", "bridleway")
graph$edge_type [graph$edge_type %in% not_these_types] <- NA_character_
sort (table (graph$edge_type), decreasing = TRUE)
#>
#> residential tertiary pedestrian primary cycleway secondary
#> 113813 23543 18702 16689 9979 5231
v <- dodgr_vertices (graph)
from <- v$id
system.time (
d <- dodgr_dists_categorical (graph, from = from, dlimit = 2000)
)
#> user system elapsed
#> 2106.168 3.884 341.442
for (n in names (d) [-1])
d [[n]] <- d [[n]] / d$distance
d$x <- v$x [match (rownames (d), v$id)]
d$y <- v$y [match (rownames (d), v$id)]
Created on 2021-10-04 by the reprex package (v2.0.1.9000)
All of Leeds takes just over 5 minutes, and the results look like this:
Those maps provide an immediate overview of locations of relatively good and bad cycling infrastructure. Each point represents the proportion of bicycle-routed trips along all possible paths outward from that point that traverse the given kinds of ways out to an arbitrary threshold of 2km. Increasing that to a larger distance (like, say 5km) generates smoother results, but simply takes longer to calculate.
Hi Mark, first impressions from that and the maps covering a huge area: seriously impressive stuff!
That's partly why this has taken so long - i knew if was possible to do this extremely efficiently, but required completely designing the path finding algorithms at the lowest level. I only recently found time to do that, and am very happy with the result. Every path from every OSM street junction in Leeds in 5 minutes.
Follow-up question on reproducibility, what was the command used to generate leeds-sc.Rds
? Will give it a spin in a free moment...
Full reprex for my hometown of Hereford:
remotes::install_github("atfutures/dodgr")
library (dodgr)
packageVersion ("dodgr")
sc = dodgr::dodgr_streetnet("hereford UK")
graph = sc %>%
weight_streetnet (wt_profile = "bicycle") %>%
dodgr_contract_graph ()
graph$edge_type <- graph$highway
graph$edge_type [grep ("^(primary|trunk)", graph$edge_type)] <- "primary"
graph$edge_type [grep ("^secondary", graph$edge_type)] <- "secondary"
graph$edge_type [grep ("^tertiary", graph$edge_type)] <- "tertiary"
graph$edge_type [graph$edge_type == "path"] <- "pedestrian"
graph$edge_type [graph$edge_type == "living_street"] <- "residential"
not_these_types <- c ("service", "footway", "unclassified", "track",
"steps", "bridleway")
graph$edge_type [graph$edge_type %in% not_these_types] <- NA_character_
sort (table (graph$edge_type), decreasing = TRUE)
#>
#> residential tertiary pedestrian primary cycleway secondary
#> 113813 23543 18702 16689 9979 5231
v <- dodgr_vertices (graph)
from <- v$id
system.time (
d <- dodgr_dists_categorical (graph, from = from, dlimit = 2000)
)
#> user system elapsed
#> 2106.168 3.884 341.442
for (n in names (d) [-1])
d [[n]] <- d [[n]] / d$distance
d$x <- v$x [match (rownames (d), v$id)]
d$y <- v$y [match (rownames (d), v$id)]
head(d)
dsf = dodgr_to_sf(d)
dsf2 = sf::st_as_sf(d, coords = c("x", "y"), crs = 4326)
Sweet! If you know precisely what you want, it'll generally be quicker just to keep that edge type (say, "cycleway"
), and set all others to NA
, which are then ignored and not aggregated. All will be documented asap in vignette, which will close this issue.
As Robin said, this looks great. Can't wait to try it out.
Thanks @mem48, it was essentially all motivated by your incidental comment:
only thought is what if you wanted more than a binary breakdown, for example, % of the journey on a 20 mph road, 30 mph road, 40 mph road, etc
That got me thinking ... for a long time ... that such an approach must indeed be possible. Turns out that it is, so thanks!
I'll have to be careful what I say in the future, I might accidentally send you off on another long project.
New vignette for you reading pleasure gentlemen! :book: :eyes: Thanks both for the input :+1:
Develop a new function modified from
dodgr_distances
to accept one additional parameter specifying a binary "flag" variable. Use this to aggregate two distances, rather than just one in currentdodgr_distances
, with the second variable only aggregated along edges with the flag variable set toTRUE
.Use cases:
Generate a sample of journeys between random points in a given area/city, and calculate things like:
The latter would provide more insight into equalstreetnames (with more cities via github repo linked at top of that page). While those websites are really cool, it's also very obvious that streets named after women tend to be both peripheral and minor. To work out the exposure of actual people traversing a city to female versus male streets requires a realistic routing of random (or all) journeys through a city. That in turn could be achieved with this functionality.
ping @Robinlovelace @mem48 Any thoughts on the usefulness of this functionality for your work? Should be pretty easy to implement, if you think it would help.