UrbanAnalyst / dodgr

Distances on Directed Graphs in R
https://urbananalyst.github.io/dodgr/
127 stars 16 forks source link

Benchmarks #212

Closed Robinlovelace closed 3 months ago

Robinlovelace commented 1 year ago

It could be good to do some reproducible benchmarks.

Opening this issue to track the idea and related code...

mpadge commented 1 year ago

I had a go at the code you included, and did some comparative benchmarking of my own. The problem is there really is no comparison bewteen what the two algorithms do. dodgr is designed and intended to calculate routing on dual-weighted graphs, so the routing is calculated on one set of weights - the profile, or routing preferences - while the actual distances are calculated using another set. cppRouting only calculates routes on a single set of weights, so is a considerably simpler algorithm, and is therefore correspondingly more efficient (between 3 - 5 times on my machine). But this comparison is not fair, because dodgr is doing a lot more work.

More importantly, routing with a single set of weights is rarely useful in any real-world scenarios. Either routes can't represent actual preferences and so are unrealistic, or else distances and/or times are unrealistic. Further complications arise in dodgr's treatment of compound intersections, which cppRouting also can not handle. Realistic road networks have all sorts of complicated penalties and restrictions on turns across intersections and oncoming traffic. dodgr captures all of these, including the potential need for cyclists to use multiple traffic-light phases where a car might be able to "simply turn". These calculations replace single junctions, as would be represented in cppRouting, with commonly between 5-10 different representations of the same junction, each in turn refelecting one possible way to enter and exit. That also means that the underlying C++ representation in dodgr includes many more intersections than cppRouting, and so dodgr is slower.

Plus there's the issue of time penalties, which dodgr also tracks. Time-based routing is not simply a re-scaled version of distance-based routing, because time-based has to include waiting time penalties and other stuff, none of which can be represented or calculated with cppRouting.


I'm not opposed to the idea in general here, but if it is to be done, it would need to highlight at least the effects of all of these kinds of differences, which would require a lot of work. Happy to see where this goes ....

Robinlovelace commented 1 year ago

I think cppRouting may be able to handle dual weighted graphs based on this:

It is now possible to set an auxiliary set of edge weights during graph construction in makegraph() function and set aggregate_aux to TRUE in getdistance* functions.

Source: https://github.com/vlarmet/cppRouting#work-with-dual-weighted-network

mpadge commented 1 year ago

Update

Adding dual weights slows cppRouting down to only around 40% faster than dodgr on my machine. Even then, it's still making quite a few simplifying assumptions that dodgr doesn't make. Most important is that graphs are not really directed, so that the time A -> B is always presumed equal to B -> A. That makes calculations much simpler, but is again not userful for accurate routing, like with hills, or directional speed limits. It also simplifies everything down to geometries, whereas dodgr always defaults to OSM id values if they exist. Geometries-only generally provides poor pedestrian routing, among other things, where infrastructure can often only be separated by IDs. Think pedestrian overpasses. Given that extra heavy work that dodgr is doing, I'm not unsatisfied with 40% or so difference.

That said, there are two notable things that cppRouting does that dodgr does not yet could. The first are far better contraction heirarchies. That likely explains a fair bit of the performance boost. Those are, however, tricky, because dodgr is kinda built with my use-cast in mind, which is not just many-to-many, but really every-to-every. In that case, the approach to creating hierarchies in cppRouting would lose most of its advantage. Those hierarchiies always presume routing only ever needs some sub-sete of points. Submitting all points to those routines would then remove all advantage. But some hybrid of the two may well be worth thinking about ... ?

The other aspect is the flow aggregation / vehible allocation routines. Those are really good, and so well implemented that it's probably not really worth trying to copy them here. I might just start using those routines myself.

Whatever happens, I'm happy to keep this issue open for a while to explore a bit further. Thanks for the impetus!

Robinlovelace commented 1 year ago

Thanks Mark glad to hear your take on it and to hear that cppRouting is worth exploring!

FlxPo commented 4 months ago

Hi Mark, how would you use the flow agregation routines from cppRouting, based on an existing dodgr graph ? Can I just pass the from / to / time columns to cppRouting's makegraph ?

mpadge commented 4 months ago

@FlxPo Yep, that would work. The results will be different - and maybe very different - from dodgr results, depending on what kind of network weighting you used. In short: cppRouting is fast; dodgr is accurate.

mpadge commented 3 months ago

Closing this; can re-open later if desired