opentripplanner / OpenTripPlanner

An open source multi-modal trip planner
http://www.opentripplanner.org
Other
2.19k stars 1.03k forks source link

Bidirectional A* #461

Closed abyrd closed 13 years ago

abyrd commented 13 years ago

Search space for non-transit A* is about twice as big as it would need to be, because we only search in one direction. Non-time-dependent trips (walk or bike only) could search both forward and backward, and halt when search spaces meet. This is particularly relevant when we allow edge weights to be unpredictable via user input (eg the bike triangle).

novalis commented 13 years ago

Do you mind if I just go ahead and implement this?

abyrd commented 13 years ago

Sure, go ahead.

novalis commented 13 years ago

Bidirectional A* requires storing separate sets of nodes that have already been visited in order to find the intersection between the two searches. Unfortunately, this slows down the search enough that bidirectional is actually slower than a plain search.

karussell commented 12 years ago

Did you investigate why this is the case? Normally RAM usage should decrease and it should also get faster. Is this algo somewhere in the code? I could investigate ...

abyrd commented 12 years ago

When doing a bidirectional search we have to detect when the two searches meet, using a closed node set. As implemented, we don't keep a set of closed nodes during the search. Via the SPT object we know which nodes have been reached by the search (i.e. which nodes have associated states), but not whether they have reached their final optimal state. The speedup from bidirectional search is roughly constant-factor (1/2 the search space), and the slowdown from building and checking a separate closed node sets is also constant-factor. Apparently, according to novalis's benchmark the latter outweighed the former -- building and checking that set made each iteration more than twice as slow on average.

That said, if you are interested in this particular problem do feel free to experiment!

karussell commented 12 years ago

What do you mean with 'closed' node set? I'll hopefully take some time and have a look into this! Is there some benchmarking skeleton code for this algorithm on a common data set?

And then I read on the wiki that the CH code is disfunctional - is this still true?

abyrd commented 12 years ago

By 'closed' I mean a node has been pulled from the queue and all its outgoing edges followed. In Dijkstra's original algorithm, this means that the final, lowest cost of reaching the node has been found. This is not always the case depending on the algorithm in use.

The CH code does run and pass tests, but we haven't been using it as much recently so it is not really well maintained. We noticed that it is slower than it used to be (relative to non-CH searches), and were thinking that maybe bugs had crept in. However, since other parts of OTP have been optimized since the CH code was written, it is also possible that CH is simply no longer as big a win as it used to be.

The other problem is that CH assumes static edge weights. Flexible user-configured weighting of safety etc. are not really compatible with this.

karussell commented 12 years ago

This is not always the case depending on the algorithm in use.

ok, because its multi modal. it is a pity as bidirectional dijkstra is about 60% faster and consumes less memory. Do you have some papers/links for this multi modal case?

When doing a bidirectional search we have to detect when the two searches meet, using a closed node set.

Have a look here, where you have a different/more correct "exit strategy":

http://www.cs.princeton.edu/courses/archive/spr06/cos423/Handouts/EPP%20shortest%20path%20algorithms.pdf

also for certain cases it is possible to use OpenBitSet (apache lucene) to reduce space consumption but only if node ids are index-alike. which I would prefer to avoid the use of hashes and use arraylists.

abyrd commented 12 years ago

On 01/30/2012 09:33 PM, Peter wrote:

This is not always the case depending on the algorithm in use.

ok, because its multi modal. it is a pity as bidirectional dijkstra is about 60% faster and consumes less memory. Do you have some papers/links for this multi modal case?

The biggest problem is that our graph is time-dependent, and we don't know at what time we should begin the reverse search. This is why we were originally going to apply the bidirectional A* only to walk/bike trips.

Our bibliography of routing articles is at: https://github.com/openplans/OpenTripPlanner/wiki/RoutingBibliography

In the multiobjective path search (which may become the main search algorithm after some more work) what we now do is run a different kind of backward search to find a lower bound on path weight at every vertex. Each edge returns a time-independent lower bound on the weight of traversal, with transit boarding edges returning 0 for example. This search is much faster than the usual time-dependent forward search, and defines an accurate A* heuristic to guide the search forward.

see BidirectionRemainingWeightHeuristic.java.

I tried stopping that search before it touched the entire graph, but it didn't provide an impressive speedup. I think the best opportunity for speeding up that reverse search is bundling all transit edges on a single route together, as well as all superimposed turn edges on a given street, for the purposes of the reverse search. My branch with the refactored vertex and edge hierarchies started as a step in this direction.

When doing a bidirectional search we have to detect when the two searches meet, using a closed node set.

Have a look here, where you have a different/more correct "exit strategy":

http://www.cs.princeton.edu/courses/archive/spr06/cos423/Handouts/EPP%20shortest%20path%20algorithms.pdf

also for certain cases it is possible to use OpenBitSet (apache lucene) to reduce space consumption but only if node ids are index-alike. which I would prefer to avoid the use of hashes and use arraylists.

In case you want to experiment with this -- every vertex already has an index, which is used as its hashcode. These are unique positive integers "close to zero" such that they can be used as an array or bitset index without wasting space. We have tried using this index to track vertices and states, and it can provide a speedup over hash tables (anecdotally 10%) that is arguably not worth the increase in code complexity and the allocation of array/set slots for every vertex, when only a small fraction of the graph might be involved in a search.

-Andrew