A* shortest path algorithm

WJSchakel commented 1 year ago

Introduction Recently I saw the YouTube video The hidden beauty of the A* algorithm which gives a wonderfully intuitive explanation of the A* algorithm. Simply put, suppose that the Dijkstra algorithm has two open nodes in its process:

Node "Rome", found over 800km, this is the destination.
Node "Berlin", found over 600km.

Dijkstra's algorithm will search further from Berlin, because 600km < 800km, so there may be a shorter route to Rome. But clearly there isn't, because Berlin is more than 200km away from Rome even in a straight line. So what A* does when it needs to choose the next link from a node, is to define the link cost as:

The length of the link
plus the euclidean distance from the end-node towards the destination node

With this approach, much less links need to be considered before the shortest path is found. The algorithm uses "node potential" to favor links towards nodes that make much more sense to prioritize in the algorithm, i.e. nodes that have lower potential (as a more physical term). Using the end-node distance towards the destination is a particular heuristic to determine this potential, but it makes a lot of sense. There is much more to this algorithm, as different heuristics can be used, and there are certain constraints to prevent negative weights and to guarantee that the path found is the shortest. The distance-to-destination heuristic meets these requirements.

The link weight can of course also be something other than distance. In that case a sensible heuristic is required. If the weight is travel time, the potential could be defined as: x_destination/v_max, where:

x_destination: euclidean distance from the end-node to the destination v_max: maximum speed that can be achieved (e.g. per GTU type, or for a specific vehicle)

Implementation We currently use LinkWeight to determine the weight of links in the Dijkstra algorithm. There is a static LENGTH and a static LENGTH_NO_CONNECTORS version. We can additionally define instances using a destination Node that report the weight as described above. There could also be a version that uses a Node and Speed for the travel time heuristic.

Just-in-time weight calculation The video mentioned above also shows, in simple code, the A* algorithm implemented. Two optimizations are mentioned:

The algorithm conceptually needs the difference of potential between end-node and start-node on a link. But, as the potential of the start-node of a link is equal for all outgoing links considered for some node, the potential of the start-node can be ignored. In the above description, it is already ignored.
It makes no sense to pre-calculate all the link weights. You only need to calculate them at the point the algorithm requests a link weight.

In OtsNetwork.buildGraph() all links are looped, and a weight for them is set, prior to the shortest path algorithm being invoked. Instead we could use an extension of the graph: JitWeightedGraph extends SimpleDirectedWeightedGraph<Node, Link>. This class would override the double getEdgeWeight(Link e) method, and use a LinkWeight to return the weight. Perhaps it should cache the calculated weights, I'm not sure whether any of the super class functionality already takes care of that.

Note that an extension of SimpleDirectedWeightedGraph is already used for the algorithm to determine lane change information in OtsRoadNetwork.

Current code optimization Removing LinkEdge Currently SimpleDirectedWeightedGraph<Node, LinkEdge<Link>> is being used, where LinkEdge wraps a Link and extends org.jgrapht.graph.DefaultWeightedEdge. But the second generics arguments may be anything. It seems that the default edge types are useful to use default edge suppliers, which are used for E Graph.addEgde(V sourceVertex, V targetVertex), but we use boolean Graph.addEdge(V sourceVertex, V targetVertex, E e). The use of LinkEdge in OTS is always with a call to getLink(). Hence there is no need to wrap a Link, LinkEdge can be removed, and we can use SimpleDirectedWeightedGraph<Node, Link>.

Algorithm code duplication We have three methods in OtsNetwork to obtain a shortest path: getShortestRouteBetween(gtuType, nodeFrom, nodeTo, linkWeight), getShortestRouteBetween(gtuType, nodeFrom, nodeTo, viaNodes), and getShortestRouteBetween(gtuType, nodeFrom, nodeTo, viaNodes, linkWeight) The second refers to the third using LinkWeight.LENGTH_NO_CONNECTORS. The first and second have a full implementation, which look highly similar. The first can refer to the third using an empty ArrayList<Node> for viaNodes. In the third method there is a line DijkstraShortestPath<Node, LinkEdge<Link>> dijkstra = new DijkstraShortestPath<>(graph); inside a for-loop. It appears this can done once before the for-loop, as the graph does not change when looping over de nodes in the route.

Graph cache The method OtsNetwork.getGraph() may retrieve a path from cache, but only for LinkWeight.LENGTH. The only method writing in the cache is OtsNetwork.buildGraph(GtuType), which uses LinkWeight.LENGTH_NO_CONNECTORS to build a graph. So not only does this not cache correctly, it also is unable to cache for other LinkWeights. But we cannot cache always. If link weights are dynamic or have a random component, the graph is invalid for a second use. We can add LinkWeight.isStaticWeight() to know whether we can cache the graph, and have a cache not only per GTU type, but also per (static) LinkWeight.

WJSchakel commented 1 year ago

I tested it with RouteWeightedGraph that is used to determine lane change info. The weights are cached nowhere. It is arguable whether it is worthwhile to cache the weights. The above calculations are not complex (e.g. instanceof Connector check, pythagoras, and an addition). For large networks, a look-up in a map may be slower. Caching in RouteWeightedGraph is more difficult, as it uses a Route that can be changed at any time.

WJSchakel commented 1 year ago

Using the end-node distance towards the destination is a particular heuristic to determine this potential, but it makes a lot of sense.

One small correction. This heuristic works if:

Link weights are their normal values.
You add the heuristic value to the regular distance so far towards a node, as the priority value to minimize in a priority queue from which the next node to consider is drawn.

As we cannot affect the algorithm at that level (as we use a tool for Dijkstra) we must affect the link weights themselves. To guarantee finding the shortest path, the heuristic to add to link weight should be: end-node potential minus start-node potential. This means that both potentials have to be calculated, not just one.

averbraeck commented 1 year ago

Note that jgrapht (which we use for Dijkstra) has an A* implementation, see https://jgrapht.org/javadoc-1.3.1/org/jgrapht/alg/shortestpath/AStarShortestPath.html. It's quite a flexible implementation.

WJSchakel commented 1 year ago

Wonderful! All we might need is a simple Pythagorean AStarAdmissibleHeuristic.

WJSchakel commented 1 year ago

Indeed. For a simple test network of 20x20 nodes, the code below makes a difference from ~25ms for Dijkstra, to ~4ms for A*.

    /**
     * Heuristic for the A* algorithm that uses Euclidean distance.
     */
    AStarAdmissibleHeuristic<Node> EUCLIDEAN_DISTANCE = new AStarAdmissibleHeuristic<>()
    {
        /** {@inheritDoc} */
        @Override
        public double getCostEstimate(final Node sourceVertex, final Node targetVertex)
        {
            return sourceVertex.getPoint().distanceSI(targetVertex.getPoint());
        }
    };

In getShortestRouteBetween():

GraphPath<Node, LinkEdge<Link>> path =
                linkWeight.getAStarHeuristic() == null ? DijkstraShortestPath.findPathBetween(graph, nodeFrom, nodeTo)
                        : new AStarShortestPath<>(graph, linkWeight.getAStarHeuristic()).getPath(nodeFrom, nodeTo);

WJSchakel commented 1 year ago

This shows an equivalent test in Matlab.

Astar

Legend: Red: covered by Dijkstra Yellow: covered by A* Green/blue: resulting routes from both algorithms

averbraeck commented 1 year ago

the code below makes a difference from ~25ms for Dijkstra, to ~4ms for A*.

That is a sizable difference! Does that include the calculation of the potentials?

WJSchakel commented 1 year ago

It does but... it also includes generation of a graph that was cached during the Dijkstra call, and re-used by the A* call. Creating the graph was the majority of the Dijkstra time.

A benchmark with 100 random networks gives A about a 20% decrease in run time. The potentials are a minute part, because letting the code calculate that 10 times (unnecessarily) does nothing about this performance gain. In my Matlab tests, A has about a 13% run time relative to Dijkstra. My guess is that the Dijkstra implementation from jgrapht is highly optimized, where A perhaps is less suitable for such optimization. In my Matlab code they run on the same algorithm, with an if-statement for 1 line of code that is different for Dijkstra and A.

But these algorithms are really quite fast either way, as setting up the test network is the majority of the running time of the benchmark code. But at least we now have A* available, which is useful for large real-world networks.

WJSchakel commented 1 year ago

AStarAdmissibleHeuristic<Node> EUCLIDEAN_DISTANCE was added to LinkWeight. This implements the Euclidian heuristic for A*.
LinkWeight ASTAR_LENGTH_NO_CONNECTORS was added to LinkWeight. This can be used with OtsNetwork.getShortestRouteBetween(...) to get a shortest path.
All implementations of OtsNetwork.getShortestRouteBetween(...) now use 1 implementation, which allows for via-nodes. The list of via-nodes may however be empty.
Removed LinkEdge as it only wrapped a Link. The graphs are now directly based on Link.

averbraeck / opentrafficsim

A* shortest path algorithm #39