makhidkarun / traveller_pyroute

Traveller trade route generator
MIT License
14 stars 5 forks source link

Reuse found distances #42

Closed CyberiaResurrection closed 1 year ago

CyberiaResurrection commented 1 year ago

When pathfinding finds a route, it stores its length in parsecs in the galaxy's ranges graph.

That length serves as a (usually) tighter lower bound on the cost of the path than the basic Star.heuristicDistance call, in turn marginally speeding up pathfinding by providing a tighter heuristic value for the distance between the path endpoints..

On my usual 12 sector testbed (4,117 stars, 41,845 routes), pathfinding took 14 min 30 s (or 870 s) before these changes, and now takes 10 min 19 s (or 619 s). I won't say no to a 28% speedup that used data already lying around.

Second stage of this work maintains an approximate shortest path tree anchored at a landmark. Initially, that tree maintains the exact shortest-path distances from the landmark to all nodes in the same component. The basic idea is realising that those distances from landmark to all nodes in same component combine with the triangle inequality to form an (often better than default) lower bound on the distance from nodes A to B. Namely, given nodes A and B, and distances from landmark L of dist (L, A) and dist (L, B): dist (A, B) >= abs (dist(L, A) - dist(L, B))

The approximation comes in by allowing those bounds some slack - currently, up to 20% over the actual values. Given same distances above, and maxbound = max(dist(L, A), dist(L, B)) and minbound = min(dist(L, A), dist(L, B)), the bound becomes: dist(A,B) >= max(0, maxbound / (1 + 0.2) - minbound)

When the absolute difference between nodes A and B exceeds (1 + 0.2) * the actual weight of the graph edge from A to B, the shortest-path tree is restarted from nodes A and B (rather than mindlessly restarting from the landmark each time, which hurts efficiency) and the distance values are updated.

Heuristic calculation has been tweaked to use the largest value among whatever is available when the heuristic call lands. If that's just the distance bound, spit that back. If a route-distance value is available that is greater than the distance bound, spit back the route-distance. Likewise for the approximate-triangle bound, and all three bounds.

The second stage further dropped pathfinding times on that 12-sector testbed to 8 min 48 s (528s) - an extra 14.7% on top of reusing existing route lengths, and a 39.3% time saving against the status quo.

Further microtweaking on that 12 sector testbed dropped pathfinding time to 6 min 23 s (383 s). That's 27.4% on top of approximate shortest-path trees, 38.1% on top of existing-route reuse, and 55.9% less time than the status quo.

For what it's worth, the 32 imperial sectors completed pathfinding after microtweaking in 4 h 39 min 3 s (16,743 s).

Thanks to @GamesByDavidE for design discussions that ultimately made (once I cottoned on) this second part of the PR work.

tjoneslo commented 1 year ago

I have a good understanding of what is going on here, both intended and implemented. Which is good. But still several questions.

In constructing the components list you would end up with component 0 with 95-98% of the systems, and a number of smaller components with just a few systems each. Would it make any difference if you ignored these smaller components?

The solution to find a landmark system is to select an important world in the component. Would it help, hurt, or have no difference if the landmark was geographically centered? e.g. select Reference (Core 0140) as your landmark for component 0 for your test bed case.

I have a concern the construction and update of the shortest_tree_path when being run on the full charted space data set (~60K worlds) will not improve the performance of the system but degrade it. Due to the O(N^2) to O(N^3) nature of the tree path routes.

CyberiaResurrection commented 1 year ago

@tjoneslo , thanks for your feedback and I'm happy that I've helped you resolve your earlier questions.

Giant component

Your giant-component question (having 95%+ of the systems) was actually how I originally implemented the approximate shortest-path stuff. What I seemed to gain on the swings I more than lost on the roundabouts, having to worry (when getting lower bounds, inside heuristic generation) whether the stars in question were in that component or not.

Cutting over to an approximate-shortest-path forest allowed me to not care about whether stars were in the giant component or not, and as a nice bonus, speed up pathfinding in smaller components (such as, frinstance, the Islands cluster, being a separate component to the giant connected component of Charted Space).

So, to directly answer your first question, ignoring smaller components with my current implementation actually hurts performance and adds more complexity.

Landmark location/choice

I haven't really tested differing choose-first-landmark schemes - the biggest-WTN star in the component was guided more by the route finding than any geometry. https://www.cs.princeton.edu/courses/archive/spr06/cos423/Handouts/GH05.pdf explains the basic ALT approach that I've adapted to the dynamic-costs case.

From what I think I remember seeing, ALT-style approaches tend to like landmarks spread out on or near the convex hull of the component. Multiple-landmarks will probably be in another PR, following the furthest-neighbour approach I outlined earlier. As a result, I'm not sure it's worth spending much skull sweat on first-landmark location. As timing results on this PR shows, the big step is from 0 to 1 landmarks in a component. Other calculation classes are free to implement their own landmark-selection strategy.

Approx-SP performance overhead

Thirdly, your concern about the performance overhead of constructing and maintaining the approx-SP forest over the full orchestra.

With default settings on trade, the biggest initial link weight I could find was 10082 on a J-7 link, which is an excess of 10075 over its (7-pc in this case) distance. That excess is multiplied by 0.9 each time the link is used in a route, and when that excess drops below 1.4 (0.2 * distance), touching that link will no longer trip approx-SP updates.

By my maths, 85 such updates will drop that pathological edge into no-update territory. Any link with a smaller initial excess will require no more than 85 updates to similarly drop into no-update territory.

On another branch, I did instrument the number of times each edge gets touched.

Over the 32 implist sectors (I think), a total of 145,528 edges got instrumented at max J-4, for a total of 19,761,445 updates. 124,554 (85.6%) edges were never touched. 5,190 (3.6%) edges were only touched once. 1,776 (1.2%) edges were touched twice. 90.4% of edges were touched no more than twice, thus never directly tripping an approx-SP update.

Up the other end: 5,454 (3.7%) of edges were touched 86+ times, for a total of 19,579,237 updates (99.1% of total updates). 19,115,647 of those updates (96.7% of total updates) were after their respective edge had dropped into no-update territory.

By subtraction, only 637,066 (3.2% of total updates) could have tripped an approx-SP update. For simplicity of analysis, I'm ignoring the elephant in the room of when one edge trips an update, it's very rarely alone.

A J-6 pathological link has a weight of 490 + 83 = 573, for an excess of 567, with 59 touches dropping it into no-update territory. That puts at least 152,712 more updates (approx a quarter of the original could-trip-updates total, discounting edges in the penumbra between the J-7 and J-6 update thresholds) into no-trip-update territory, and if anything, strengthens my argument. If only 3.2% of total updates (J-7 threshold) are not a major cause for concern, then ~2.4% of total updates (J-6 threshold) aren't, either.

From what little data I have over a 4-sector testbed (Core, Fornast, Lishun, Antares), approx 20% of routes tripped an approx-SP update, with an average of 2.8 seeds (ie, nodes to restart the implicit-dijkstra search from) passed in each time.

tjoneslo commented 1 year ago

By my maths, 85 such updates will drop that pathological edge into no-update territory. Any link with a smaller initial excess will require no more than 85 updates to similarly drop into no-update territory.

A question for the next round of fixes then. Would it make sense to use the route use counts (an integer comparison) to limit the number update rather than multiple floating point calculations?

CyberiaResurrection commented 1 year ago

By my maths, 85 such updates will drop that pathological edge into no-update territory. Any link with a smaller initial excess will require no more than 85 updates to similarly drop into no-update territory.

A question for the next round of fixes then. Would it make sense to use the route use counts (an integer comparison) to limit the number update rather than multiple floating point calculations?

Prima facie, it would seem that update_edges() in ApproximateShortestPathTree would be the spot to do so, at the price of explicitly calculating and storing an update bound for each edge in the ruddy graph. Then, if edge's current update count exceeds its update bound, it doesn't directly trip a restart.

I can see two possible snags to this. 1 - An ASPT instance with epsilon = 0 will always trip updates. (I did this initially to wring out the implementation) 2 - If multiple ASPT instances are present (ie, multiple landmarks per component), that update-count bound would have to be calculated on the lowest positive epsilon value among the instances present. For example, say two ASPT instances are present, the first with epsilon = 0.2, the second (for sake of argument) with epsilon = 0.01. The pathological links mentioned above have update bounds of 85 and 59 touches respectively for epsilon = 0.2, and 113 and 87 respectively for epsilon = 0.01.

The no-update gubbins currently is an implicit consequence of the existing, and well-tested, route update mechanics, and will change organically as parms change - frinstance, that J-7 pathological link with route-reuse of 50 will no-update after 440 updates.