Before this PR, bulk lower-bound calculations in the approx-SP forest involved spinning through each approx-SP tree the forest contained, making the bulk lower-bound call on the tree, and then collecting and reducing the results.
Even though the bulk heuristic is now only called once per pathfinding run, its runtime will become an increasing problem (and a growing chunk of overall pathfinding runtime) as:
tighter upper and lower bounds are generated and used, leading to fewer nodes queued and expanded;
more landmarks are added to each connected component (which also tightens lower bounds - a double whammy).
Obvious solution is obvious - transform the sequence of N method calls into one such call, leaning on numpy to manipulate the unified distances array in bulk.
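For concreteness, here's a minimal sketch of the "one call" shape, assuming the unified distances array is laid out as (num_trees, num_nodes) and the per-tree bound is the usual landmark-style |d(tree, node) - d(tree, target)| term; the names here are illustrative, not the actual forest API.

```python
import numpy as np

def forest_lower_bound(distances: np.ndarray, active_nodes: np.ndarray,
                       target: int) -> np.ndarray:
    # distances: (num_trees, num_nodes) unified distances array for the forest.
    # active_nodes: indices of the nodes that still need lower bounds.
    target_col = distances[:, target][:, np.newaxis]   # (num_trees, 1)
    block = distances[:, active_nodes]                 # (num_trees, num_active)
    # One vectorised reduction replaces N per-tree bulk calls plus a manual
    # collect-and-reduce over their results.
    return np.abs(block - target_col).max(axis=0)
```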
Pulling all the distance manipulation into one spot surfaced a couple more tweaks.
Tweak the first - fastpath: Memoise which rows of the distance matrix have finite values for a given target node.
If any of those rows start infinite for a given target node, they stay infinite (thanks to how the edge-update gubbins work), so they never have to be retrieved from self._distances.
Conversely, if no such rows are infinite, explicitly retrieve all rows, obviating the need to slice self._distances by row.
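A minimal sketch of the fastpath, assuming a per-target dict cache hanging off the forest object; the attribute and method names are hypothetical.

```python
import numpy as np

class ApproxSPForestSketch:
    # Illustrative only: self._distances is the (num_trees, num_nodes) array,
    # self._finite_rows caches the finite-row selection per target node.
    def __init__(self, distances: np.ndarray):
        self._distances = distances
        self._finite_rows = {}

    def _finite_row_block(self, target: int) -> np.ndarray:
        if target not in self._finite_rows:
            finite = np.isfinite(self._distances[:, target])
            # A row that starts infinite for this target stays infinite under
            # edge updates, so the selection only needs computing once per target.
            self._finite_rows[target] = None if finite.all() else np.flatnonzero(finite)
        rows = self._finite_rows[target]
        # fastpath: every row finite, so skip the row slice entirely.
        return self._distances if rows is None else self._distances[rows, :]
```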
Tweak the second - speedypath: If on the fastpath and all nodes are active, slicing would implicitly retrieve the entire self._distances array. Explicitly retrieve the whole thing instead.
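Continuing the same hypothetical sketch, the speedypath check sits on top of the fastpath result: when no row slice was needed and every node is active, the column slice would just copy the whole array, so hand it over untouched.

```python
    def _bound_block(self, target: int, active_nodes: np.ndarray) -> np.ndarray:
        block = self._finite_row_block(target)   # fastpath from the sketch above
        if block is self._distances and active_nodes.size == block.shape[1]:
            # speedypath: all rows finite and all nodes active - slicing would
            # just rebuild the whole array, so return it as-is.
            return block
        return block[:, active_nodes]
```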
Timing changes from the bulk-call consolidation alone were within profiling noise, so I can't claim any speedup from that change by itself.
As at the status quo, Zhodani heartland at route reuse=10 had 809s overall time, 666s pathfinding.
With the fastpath tweaks landed, same setup had 696s overall time, 613s pathfinding.
A 7.9% speedup in pathfinding time isn't massive, but it isn't tiny either, and is nothing to sneeze at.