Benefiting from tightened lower bounds in A* pathfinding only requires an admissible heuristic - one that never over-estimates cost to the target from some current node.
Benefiting from tightened upper bounds, however, requires that the heuristic be consistent, not merely admissible. A consistent heuristic is an admissible heuristic such that, over all nodes K, for a given node N and each successor P of N, the heuristic value h(N) is not greater than the sum of the cost of the edge from N to P, c(N, P), and the heuristic value at P, h(P). In other words:
c(N, P) >= h(N) - h(P)
As at the status quo, the bulk heuristic was the case-wise maximum of the following subheuristics:
approx-shortest-path
min-edge-cost
historic path distances
The first two subheuristics are themselves consistent, but the historic distances can only be consistent after the bitter end of pathfinding. If one subheuristic is not consistent, the resulting bulk heuristic is not consistent.
Enter again @GamesByDavidE , whose advice and feedback proved invaluable in diagnosing the non-consistent parts of the bulk heuristic. I followed his suggestion to replace the historic distances subheuristic with plain node-to-target distances in parsec.
I found out the hard way that this sort of exogenous upper bound requires a consistent heuristic to avoid having some nontrivial fraction (~15% in my tests) of pathfinding attempts fail to find a route. Each failed attempt found a route when it wasn't supplied with an upper bound value.
With that sorted out, onto the meat of this PR. My motivating observation was that when the first finite upper bound lands during pathfinding, the queue size gets slashed dramatically, with later nodes having their viable (ie, not bound busting) neighbour sets similarly trimmed - sometimes to exhaustion (ie, no viable neighbours).
For example, over the Zhodani Consulate heartland (Zdiedeiant, Stiatlchepr, Zhdant, Tienspevnekr):
Iefla (Zdiedeiant 0610) to Fievr (Stiatlchepr 2826),
54 pc straight line distance
First endogenous upbound found after 400 nodes expanded
280 nodes in queue as at first upbound
277 nodes dropped from queue because they bust the new upbound
Adliev (Zdiedeiant 1227) to Fievr (Stiatlchepr 2826)
48 pc straight line distance
First endogenous upbound found after 368 nodes expanded
368 nodes in queue as at first upbound
360 nodes dropped from queue because they bust the new upbound
Question The First - What happens if such an upper bound is supplied at the start of pathfinding?
Question The Second - How do we generate that upper bound?
As edge, and thus route, costs can only stay the same (if no involved edge is re-used in the meantime) or decrease (if at least one edge is re-used) in later pathfinding runs, the route cost when a route is found is an upper bound on the cost to travel that route at any later time.
I've salvaged some of the book-keeping from the skip-routing attempts and added skip links to the galaxy object's stars graph (not the DistanceGraph actually used for pathfinding, as tests showed the added links slowed down pathfinding).
When pathfinding from S to T, the route cost is obviously unknown, as the shortest path hasn't been found.
That doesn't preclude the following three cases:
A historic-route cost exists from at least one of S's immediate neighbours, S_n, to T.
A historic-route cost exists from at least one of T's immediate neighbours, T_n, to S.
At least one intermediate node, I, is connected by historic routes to both S and T.
In 1's case (and 2's, mutatis mutandis) the lower bound is the sum of C(S, S_n) and the historic-route cost from S_n to T. In 3's case, the upper bound is the sum of the two connecting historic routes. If a given case has multiple candidates, return the one with minimal upper bound. If multiple cases return finite upper bounds, return the lowest.
I've called that pre-heating the upper bound.
Especially for historic routes found early in pathfinding, there's no guarantee these upper bounds will be anything approaching firm, let alone tight.
Re-heating the upper bound takes the set of historic routes that are involved in the winning candidates across all three cases (so, from 1 to 4 historic routes, depending on which case(s) returned finite bounds), re-evaluates their historic-route costs as at the current pathfinding state, and updates the stored costs for the affected historic routes.
Percentage changes are from the master value for that item, over the Zhodani Consulate heartland (Zdiedeiant, Stiatlchepr, Zhdant, and Tienspevnekr - 2,231 worlds, 108,240 routes).
As for the bound checks:
The first check (Neighbour-set bound checks) checked each neighbour's g value against the lower of the upper bound and the existing distance label for that node.
The second check (Un-exhausted set checks with finite bound) checked each remaining neighbour's f value (ie, including the heuristic) against the upper bound, if finite.
Item
Master
Pre-heat
% drop
Re-heat
% drop
Nodes popped from queue
11,075,222
10,764,518
2.8
10,764,504
2.8
Nodes revisited
44,160
44,180
-0.05
44,190
-0.07
Nodes re-expanded
433
404
6.7
404
6.7
Neighbour-set bound checks
10,923,248
10,612,505
2.8
10,612,482
2.8
Neighbour-set checks with finite bound
1,368,726
10,608,317
-
10,608,294
-
% of neighbour-set checks w/finite bound
12.53
99.96
-
99.96
-
Exhausted nodes
2,084,714
3,157,675
-51.5
3,166,334
-51.8
Un-exhausted nodes
8,838,534
7,454,830
15.66
7,446,148
15.75
Un-exhausted set checks with finite bound
973,633
7,451,571
-665.34
7,442,889
-664.45
% of un-exhausted set checks w/finite bound
11.02
99.96
-
99.96
-
New upper bounds during pathfinding
236,230
113,791
51.83
113,400
52.00
Nodes actually having neighbours to queue
8,594,714
5,508,333
35.91
5,492,012
36.10
Nodes added to queue
67,584,700
12,291,594
81.81
12,190,946
81.96
Final neighbourhood size per node
7.87
2.23
71.66
2.22
71.79
Sans profiling, over the same sectors:
Master: 714 s overall, 636 s pathfinding
Re-heat: 657 s overall, 570s pathfinding
A 10.3% reduction in pathfinding time's nothing to sneeze at.
Benefiting from tightened lower bounds in A* pathfinding only requires an admissible heuristic - one that never over-estimates cost to the target from some current node.
Benefiting from tightened upper bounds, however, requires that the heuristic be consistent, not merely admissible. A consistent heuristic is an admissible heuristic such that, over all nodes K, for a given node N and each successor P of N, the heuristic value h(N) is not greater than the sum of the cost of the edge from N to P, c(N, P), and the heuristic value at P, h(P). In other words: c(N, P) >= h(N) - h(P)
As at the status quo, the bulk heuristic was the case-wise maximum of the following subheuristics:
The first two subheuristics are themselves consistent, but the historic distances can only be consistent after the bitter end of pathfinding. If one subheuristic is not consistent, the resulting bulk heuristic is not consistent.
Enter again @GamesByDavidE , whose advice and feedback proved invaluable in diagnosing the non-consistent parts of the bulk heuristic. I followed his suggestion to replace the historic distances subheuristic with plain node-to-target distances in parsec.
I found out the hard way that this sort of exogenous upper bound requires a consistent heuristic to avoid having some nontrivial fraction (~15% in my tests) of pathfinding attempts fail to find a route. Each failed attempt found a route when it wasn't supplied with an upper bound value.
With that sorted out, onto the meat of this PR. My motivating observation was that when the first finite upper bound lands during pathfinding, the queue size gets slashed dramatically, with later nodes having their viable (ie, not bound busting) neighbour sets similarly trimmed - sometimes to exhaustion (ie, no viable neighbours).
For example, over the Zhodani Consulate heartland (Zdiedeiant, Stiatlchepr, Zhdant, Tienspevnekr):
Iefla (Zdiedeiant 0610) to Fievr (Stiatlchepr 2826),
Adliev (Zdiedeiant 1227) to Fievr (Stiatlchepr 2826)
Question The First - What happens if such an upper bound is supplied at the start of pathfinding?
Question The Second - How do we generate that upper bound?
As edge, and thus route, costs can only stay the same (if no involved edge is re-used in the meantime) or decrease (if at least one edge is re-used) in later pathfinding runs, the route cost when a route is found is an upper bound on the cost to travel that route at any later time. I've salvaged some of the book-keeping from the skip-routing attempts and added skip links to the galaxy object's stars graph (not the DistanceGraph actually used for pathfinding, as tests showed the added links slowed down pathfinding).
When pathfinding from S to T, the route cost is obviously unknown, as the shortest path hasn't been found. That doesn't preclude the following three cases:
In 1's case (and 2's, mutatis mutandis) the lower bound is the sum of C(S, S_n) and the historic-route cost from S_n to T. In 3's case, the upper bound is the sum of the two connecting historic routes. If a given case has multiple candidates, return the one with minimal upper bound. If multiple cases return finite upper bounds, return the lowest.
I've called that pre-heating the upper bound.
Especially for historic routes found early in pathfinding, there's no guarantee these upper bounds will be anything approaching firm, let alone tight.
Re-heating the upper bound takes the set of historic routes that are involved in the winning candidates across all three cases (so, from 1 to 4 historic routes, depending on which case(s) returned finite bounds), re-evaluates their historic-route costs as at the current pathfinding state, and updates the stored costs for the affected historic routes.
Percentage changes are from the master value for that item, over the Zhodani Consulate heartland (Zdiedeiant, Stiatlchepr, Zhdant, and Tienspevnekr - 2,231 worlds, 108,240 routes).
As for the bound checks: The first check (Neighbour-set bound checks) checked each neighbour's g value against the lower of the upper bound and the existing distance label for that node. The second check (Un-exhausted set checks with finite bound) checked each remaining neighbour's f value (ie, including the heuristic) against the upper bound, if finite.
Sans profiling, over the same sectors: Master: 714 s overall, 636 s pathfinding Re-heat: 657 s overall, 570s pathfinding A 10.3% reduction in pathfinding time's nothing to sneeze at.