Project-OSRM / osrm-backend

Open Source Routing Machine - C++ backend
http://map.project-osrm.org
BSD 2-Clause "Simplified" License
6.29k stars 3.32k forks source link

Enhancing route alternative(s) #2883

Closed kodonnell closed 4 weeks ago

kodonnell commented 8 years ago

Some suggestions - apologies if I've said something silly, but I've had a fair look through the issues/doc/Google but I'm not too much wiser (hence the first point).

In other words, it'd be good to give the option to a (well-informed, hence the doco) user to do potentially slow/risky things, if they wish (like asking for a hundred alternatives).

danpat commented 8 years ago

@kodonnell The title of the paper Dennis linked to was "Candidate Sets for Alternative Routes in Road Networks" by himself and Dennis Schieferdecker.

Extended abstract here: https://www.aaai.org/ocs/index.php/SOCS/SOCS13/paper/viewFile/7220/6261

Presentation here: https://algo2.iti.kit.edu/download/s-csarrn-12.pdf

As implemented in OSRM at the moment, the code will only return at most 1 alternative, sometimes 0. There are no parameters to tune to change that. The constants you referenced will only affect the probability of finding a suitable single alternative, they will not cause it to return more than one.

For a slightly more up-to-date overview on alternative route finding, the introduction section of @MoKob 's thesis at http://algo2.iti.kit.edu/documents/Dissertation_Kobitzsch_Moritz.pdf covers a lot of ground very well.

Finding hundreds of alternatives will require a specific algorithm implementation which OSRM does not currently have. Finding good alternative routes is a hard problem.

I think this is something very desirable, so we should keep this ticket open as a placeholder until someone tackles an implementation :-)

kodonnell commented 8 years ago

Thanks for the info @danpat - an interesting read. Admittedly it'd take me a lot longer to fully digest it, but from what I saw, there was nothing which limited it to single alternatives. It'd be interesting to here from @DennisOSRM or @DennisSchiefer (or yourself?) on the reasons for it only returning one match - and how hard it'd be to tweak to return multiple (including possible suggestions).

Hundreds of alternatives may have been an exaggeration, but my use case will help give context. We're looking at cellular data which by nature includes very sparse points (e.g. separated by hours and hundreds of kilometres), which can be very 'fuzzy' (including knowing sometimes only the general region of the device to within a few kms). We're interested in finding many alternate routes so that we can build up a fair representation of device movement (at an aggregate level only), recognising that in the real world people don't always choose the fastest route. (To be clear, this is a map matching problem, for now.) It is arguably quite unique, though with the advent of more IoT devices and wifi/beacon technology, I suspect it may arise more often.

danpat commented 7 years ago

To be honest, I'm not all that familiar with our alternative route finding, I've only got a passing understanding of what it does.

Our implementation creates a set of potential via points, then checks them to see if they're any good:

https://github.com/Project-OSRM/osrm-backend/blob/master/include/engine/routing_algorithms/alternative_path.hpp#L309-L323

We only select the first point that passes the test, but I don't see a reason we couldn't look at more and potentially return more routes. @MoKob can speak a lot better to how good the rest of the alternatives would likely be.

MoKob commented 7 years ago

@kodonnell the reason is conceptually very simple. Alternative routes are just the exact opposite of what speed-up techniques are doing. If you would consider the perfect technique to finding a shortest path, it would be only looking at exactly the shortest path. A Contraction Hierarchy is definitely not perfect, but it comes darn close. If you look at routes over an entire continent, a CH will only investigate a few hundred additional intersections in addition to the shortest path itself.

The big problem now is, if you only find very few paths, how can you find a good alternative. The paths you will find for a route from a small town to the town right next to it will also include (if you don't stop right away) paths crossing through towns hundreds of kilometers away. E.g. if you want to go from Washington DC to New York, you might also find a path crossing through San Francisco. That is ok for us, though, since we can stop searches based on the best distance found so far and the entire set of nodes we can look at in the worst case is very small (a few hundreds, maybe up to the lower thousands).

If you want to find a good via-alternative route, which is defined as finding a path from s->v->t for an intermediate v, we need to find a good v first. For this, theoretically millions of vertices are possible choices. To find a good one is a hard problem. For more information, I really recommend reading the literature @danpat mentioned here.

kodonnell commented 7 years ago

@MoKob, I understand, I think. However, the question is (as @danpat has reiterated) - if you've already got a method for finding alternatives, why only return the "best" one? Is there any systematic reason why the second alternative is not valid?

Personally, I'd look at returning all alternatives which meet a certain criteria (which may be "just give me the top N"), as well as allowing the user to adjust what "acceptable" is (e.g. the parameters I referred to above.) From what @danpat has said, it seems like the algorithmic machinery is already in place to achieve this.

(I am reading the aforementioned literature, but there's quite a lot!)

danpat commented 7 years ago

Personally, I'd look at returning all alternatives which meet a certain criteria

That right there is the rub. That's basically what we do, but the results still aren't great. Here's roughly what OSRM does right now:

  1. Find the shortest route.
  2. Continue the Dijkstra search for 10% (VIAPATH_ALPHA) past the length of the first meeting point.
  3. Each time the forward/reverse searches meet, store the node as a "candidate node"
  4. Use a heuristic to sort the candidates by "most likely to give us an alternative first". It's just a heuristic, it's not perfect.
  5. Iterate over the sorted candidate node list, test if the route from A->candidate->B meets our critera. Return that route if we do.

The not-so-obvious problem here is that many of the viapoints are useless. They find only trivially different paths from the shortest, or nearby viapoints return very similar paths. The size of the list of candidates depends on the shape of the CH, and how far apart your A and B points are. We often go all the way through the list without finding a viable alternative. This is partly where @MoKob 's description of a CH being orthogonal to alternative route finding - the CH means that we don't get a huge number of candidates.

I had a good long voice conversation with @MoKob about this a couple of days ago. With the approach OSRM uses, we've got about a 60% chance of finding 1 route that meets our criteria. If we continued testing viapoints, the chance drops to 30% for a second alternative, then 10% chance for a third. Very much diminishing returns. We could have the algorithm return the alternative route for each via, but you'd find that they're mostly the same, or don't fit the "different enough from the fastest route" criteria.

The "find alternatives using vias" is a very active research area - @MoKob 's dissertation linked above (see section 5) outlines the current state-of-the-art for various approaches.

kodonnell commented 7 years ago

So, summarising:

Is that about right?

How easy would it be to allow the user to choose the algorithm (in addition to any parameters)? That would mean OSRM could meet more use cases, as well of saving us the pain of having to choose the "best" one - including what we do when we want to introduce a new "best" algorithm (i.e. change control). A pluggable codebase (with built-in metrics for comparing algorithms!!) would also be great for R&D and community.

MoKob commented 7 years ago

Well. Given the new v5 api, adding more routes wouldn't be difficult. The main problem that a parameter resembling anything like max_alternatives=5 would also result in many people to expect some alternatives. Our current implementation will offer an alternative route in only very few cases, a second in even fewer. So any configuration in that regard would only give a wrong impression of our capabilities.

Choosing an algorithm, however, would be a totally different topic. Algorithms to compute alternative routes can only be implemented efficiently if the data is organised in a way that supports their computation model. Simply having a framework to evaluate a series of algorithms to compute alternative routes is not an easy task.

We are currently working with a lot of internal tools to improve the quality of the osrm routes we find. It might be that we can open source them at some point. For osrm itself, I don't see any immediate plans to integrate swapping capabilities for algorithms. The interconnectivity between the different parts is just to high to allow easily swapping out the core of the algorithm.

emiltin commented 7 years ago

does that mean that mapbox directions is not running plain osrm, but somehow uses additional tools to improve routing? or are the tools more related to improving osm data?

kodonnell commented 7 years ago

@MoKob - in a perfect world, I'd be able to write a pluggable alternate route finder, in addition to a pluggable data store (in whatever format is best for my algorithm). Agreed that's it's non-trivial!

danpat commented 7 years ago

@kodonnell Well, OSRM already has plugins :-) We don't have pluggable datastores at the moment though, so basically the only graph you have to work with is the current CH that we create. This limits what algorithms you can implement in any extra plugins you write.

danpat commented 7 years ago

@emiltin The tools are more related to improving the data. We also extensively use the traffic data loading features for some private endpoints, and those are highly data-driven.

github-actions[bot] commented 2 months ago

This issue seems to be stale. It will be closed in 30 days if no further activity occurs.