makhidkarun / traveller_pyroute

Traveller trade route generator
MIT License

Scale # of landmarks with component size #103

Closed CyberiaResurrection closed 7 months ago

CyberiaResurrection commented 7 months ago

Giving every component the exact same number of landmarks is infeasible - if (say) all components get 9 landmarks, what happens for individual components with fewer than 9 stars?

This PR scales the number of landmarks, L, as a function of component size, N: L = min(15, ceil(3 * log10(N)))

Single-star components have zero landmarks, but are never hit in pathfinding, so they're not a problem. 10-star components have 3 landmarks, 100-star components have 6 landmarks, 1,000-star components have 9 landmarks, 10,000-star components have 12 landmarks and 100,000-and-up-star components have the maximal 15 landmarks.
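As a minimal sketch of that scaling rule (the `landmark_count` name is mine, not the project's):

```python
import math

def landmark_count(component_size: int, ceiling: int = 15) -> int:
    # L = min(15, ceil(3 * log10(N))); a single-star component gets zero
    # landmarks, which is harmless because it is never hit in pathfinding.
    return min(ceiling, math.ceil(3 * math.log10(component_size)))

# component size:   1   10  100  1,000  10,000  100,000+
# landmark count:   0    3    6      9      12       15
for n in (1, 10, 100, 1_000, 10_000, 100_000):
    print(n, landmark_count(n))
```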

The first six landmarks are the extremes along the q, r and s axes (extending the status quo). The seventh landmark keys off an observation I tripped over - all else equal, having the pathfinding target be a landmark speeds up pathfinding compared to a non-landmark. Where both source and target stars are non-landmarks, or both are landmarks, the status quo prevails - likewise where only the target star is a landmark. Where the source is a landmark and the target isn't, collect the speed gain by transposing the two before that pathfinding run starts. The source-target transpose has been good for 1-3% fewer nodes expanded during testing.
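A minimal sketch of that transpose (`orient_query` and its arguments are illustrative names, not pyroute's API); assuming the route graph is undirected, swapping the endpoints only changes how fast the search runs, not which path it finds:

```python
def orient_query(source, target, landmarks):
    # If only the source is a landmark, swap the endpoints so the landmark
    # becomes the pathfinding target; otherwise leave the query alone.
    if source in landmarks and target not in landmarks:
        return target, source
    return source, target
```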

Eighth and later landmarks are generated using the "avoid" algorithm: it seeks out gaps in bound coverage and iteratively fills them in.
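A much-simplified sketch of that gap-filling idea, assuming a `dist(a, b)` shortest-path oracle and a `lower_bound(a, b)` derived from the landmarks chosen so far; the PR's actual implementation differs in detail:

```python
def pick_avoid_landmark(stars, chosen, dist, lower_bound, root):
    # Score each candidate by the gap between its true distance from the
    # root and the best landmark-derived lower bound; a large gap means the
    # existing landmarks cover that region of the component poorly.
    gaps = {s: dist(root, s) - lower_bound(root, s)
            for s in stars if s not in chosen}
    # The star sitting in the worst-covered gap becomes the next landmark.
    return max(gaps, key=gaps.get)
```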

On top of that, I collected a further speedup by ensuring every pathfinding attempt starts with a finite upper bound (extending upper bound preheating) and dropping the has_bound checks in astar_numpy.
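A minimal sketch of why a preheated bound lets those checks go; the names and structure here are illustrative, not astar_numpy's, and `initial_bound` is assumed to be a genuine finite upper bound on the optimal cost:

```python
import heapq
import itertools
import math

def astar_preheated(neighbours, heuristic, source, target, initial_bound):
    # Because initial_bound is always finite, pruning against the upper
    # bound needs no "has a bound been set yet?" branch in the inner loop.
    best_g = {source: 0.0}
    tie = itertools.count()                  # heap tie-breaker
    frontier = [(heuristic(source, target), next(tie), source)]
    while frontier:
        _, _, node = heapq.heappop(frontier)
        if node == target:
            return best_g[node]
        for nbr, weight in neighbours(node):
            g = best_g[node] + weight
            if g >= best_g.get(nbr, math.inf):
                continue
            f = g + heuristic(nbr, target)
            if f > initial_bound:            # unconditional: bound is finite
                continue
            best_g[nbr] = g
            heapq.heappush(frontier, (f, next(tie), nbr))
    return None
```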

Further testing showed:

- Some wombat was selecting duplicate landmarks.
- It was trivial to sort component stars by descending WTN before picking landmarks (sketched below).
- The ceiling on # of landmarks has to be dialled back markedly with higher route-reuse values.
- The aforementioned wombat was collecting enough data that it was worth automating under a debug flag.
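A tiny sketch of the de-duplication and WTN ordering; the `.wtn` attribute name is illustrative rather than necessarily pyroute's:

```python
def order_candidates(component_stars):
    # De-duplicate while preserving order, then sort by descending WTN so
    # the highest-WTN stars are considered first as landmarks.
    unique = list(dict.fromkeys(component_stars))
    return sorted(unique, key=lambda star: star.wtn, reverse=True)
```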

Status quo:

| Item | Reuse 5 | Reuse 10 | Reuse 20 | Reuse 140 | Reuse 1000 |
| --- | --- | --- | --- | --- | --- |
| Nodes popped | 9,634,668 | 9,376,398 | 8,884,714 | 9,901,048 | 13,698,150 |
| Nodes revisited | 35,351 | 38,599 | 50,177 | 86,502 | 269,339 |
| Neighbour-set bound checks | 9,491,071 | 9,229,569 | 8,726,407 | 9,706,577 | 13,321,459 |
| g-exhausted nodes | 2,985,722 | 2,712,781 | 2,505,205 | 2,844,931 | 2,472,773 |
| Un-g-exhausted nodes | 6,505,349 | 6,516,788 | 6,221,202 | 6,861,646 | 10,848,686 |
| f-exhausted nodes | 1,490,446 | 1,617,406 | 1,531,324 | 1,609,014 | 4,007,017 |
| Un-f-exhausted nodes | 5,014,903 | 4,899,382 | 4,689,878 | 5,252,632 | 6,841,669 |
| New upper bounds | 111,281 | 113,208 | 115,926 | 115,788 | 146,694 |
| Nodes actually queueing | 4,914,034 | 4,797,657 | 4,586,744 | 5,148,087 | 6,727,274 |
| Neighbours queued | 10,199,442 | 10,567,719 | 10,437,109 | 16,513,909 | 29,567,669 |
| Final neighbourhood size per node | 2.08 | 2.21 | 2.31 | 3.21 | 4.53 |
With this PR, after changes to epsilon selection:

| Item | Reuse 5 | Reuse 10 | Reuse 20 | Reuse 140 | Reuse 1000 |
| --- | --- | --- | --- | --- | --- |
| Nodes popped | 5,312,722 | 5,282,146 | 4,706,482 | 4,796,980 | 6,733,488 |
| Nodes revisited | 10,863 | 10,355 | 15,049 | 29,493 | 95,679 |
| Neighbour-set bound checks | 5,193,612 | 5,163,544 | 4,583,186 | 4,659,240 | 6,529,562 |
| g-exhausted nodes | 1,173,131 | 1,010,401 | 850,807 | 981,660 | 1,073,027 |
| Un-g-exhausted nodes | 4,020,481 | 4,153,143 | 3,732,379 | 3,667,580 | 5,456,535 |
| f-exhausted nodes | 964,032 | 1,098,538 | 933,345 | 905,176 | 2,204,279 |
| Un-f-exhausted nodes | 3,056,449 | 3,054,605 | 2,799,034 | 2,762,404 | 3,252,256 |
| New upper bounds | 109,961 | 111,290 | 112,852 | 120,775 | 128,019 |
| Nodes actually queueing | 2,951,596 | 2,950,446 | 2,695,342 | 2,665,909 | 3,142,909 |
| Neighbours queued | 5,712,005 | 5,921,670 | 5,428,872 | 7,149,460 | 11,067,783 |
| Final neighbourhood size per node | 1.94 | 2.01 | 2.02 | 2.66 | 3.53 |
| Pathfinding time (s) | 468 | 493 | 539 | 653 | 836 |
And the percentage drops:

| Item | Reuse 5 | Reuse 10 | Reuse 20 | Reuse 140 | Reuse 1000 |
| --- | --- | --- | --- | --- | --- |
| Nodes popped | 44.85% | 43.66% | 50.75% | 51.55% | 50.84% |
| Un-g-exhausted nodes | 38.19% | 36.27% | 40.00% | 46.54% | 49.70% |
| Un-f-exhausted nodes | 39.05% | 37.65% | 40.31% | 59.62% | 52.46% |
| Nodes actually queueing | 39.93% | 38.50% | 41.23% | 48.21% | 53.28% |
| Neighbours queued | 43.99% | 43.96% | 47.98% | 56.70% | 62.56% |
With this PR, originally:

| Item | Reuse 5 | Reuse 10 | Reuse 20 | Reuse 140 | Reuse 1000 |
| --- | --- | --- | --- | --- | --- |
| Nodes popped | 5,312,722 | 5,282,146 | 4,375,247 | 5,839,554 | 13,300,004 |
| Nodes revisited | 10,863 | 10,355 | 12,245 | 42,912 | 261,387 |
| Neighbour-set bound checks | 5,193,612 | 5,163,544 | 4,254,755 | 5,688,395 | 12,930,370 |
| g-exhausted nodes | 1,173,131 | 1,010,401 | 765,947 | 1,288,841 | 2,537,213 |
| Un-g-exhausted nodes | 4,020,481 | 4,153,143 | 3,488,808 | 4,399,554 | 10,393,517 |
| f-exhausted nodes | 964,032 | 1,098,538 | 841,978 | 1,118,483 | 3,681,824 |
| Un-f-exhausted nodes | 3,056,449 | 3,054,605 | 2,370,325 | 3,281,071 | 6,711,333 |
| New upper bounds | 109,961 | 111,290 | 112,852 | 123,204 | 144,615 |
| Nodes actually queueing | 2,951,596 | 2,950,446 | 2,540,952 | 3,173,650 | 6,596,417 |
| Neighbours queued | 5,712,005 | 5,921,670 | 5,047,606 | 9,589,578 | 29,156,973 |
| Final neighbourhood size per node | 1.94 | 2.01 | 1.99 | 3.21 | 4.43 |
| Pathfinding time (s) | 468 | 493 | 518 | 696 | 1446 |
And the percentage drops:

| Item | Reuse 5 | Reuse 10 | Reuse 20 | Reuse 140 | Reuse 1000 |
| --- | --- | --- | --- | --- | --- |
| Nodes popped | 44.85% | 43.66% | 50.75% | 41.02% | 5.10% |
| Un-g-exhausted nodes | 38.19% | 36.27% | 43.92% | 35.88% | 4.19% |
| Un-f-exhausted nodes | 39.05% | 37.65% | 49.45% | 37.53% | 1.90% |
| Nodes actually queueing | 39.93% | 38.50% | 44.60% | 38.35% | 1.94% |
| Neighbours queued | 43.99% | 43.96% | 51.63% | 41.93% | 1.38% |
CyberiaResurrection commented 7 months ago

@tjoneslo, I think the relative edge exhaustion speeds are the main culprit here.

Going back to our pathological J7 edge of initial weight 10082: its excess over the base weight of 7 is thus 10075.

At reuse=10, epsilon is 0.1, so the excess threshold to exhaust that edge is 0.7 - a ratio of 14,392.6 against the starting excess. Each time that edge gets used in a route, the excess is multiplied by (1 - epsilon), or 0.9 in this case. 91 updates exhaust that edge and preclude further updates from tripping approx-SP updates.

At reuse=1000, epsilon is 0.001, so excess to exhaust that edge is 0.007 - a ratio of 1,439,285.8. Each time that edge gets used, excess is multiplied by (1 - epsilon), or 0.999 in this case. 14,173 updates are needed to exhaust that edge with reuse=1000, vs 91 with reuse=10. Assuming epsilon of 0.1 while reuse stays at 1000, 9,570 updates are needed.
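That arithmetic can be reproduced directly; a sketch, with the exhaustion rule paraphrased from the numbers in this comment rather than lifted from the code:

```python
import math

def updates_to_exhaust(excess, base_weight, epsilon):
    # An edge counts as exhausted once its excess over the base weight falls
    # below epsilon * base_weight; each route that reuses the edge multiplies
    # the remaining excess by (1 - epsilon).
    threshold = epsilon * base_weight
    return math.ceil(math.log(threshold / excess) / math.log(1 - epsilon))

# Pathological J7 edge: initial weight 10082, base weight 7, excess 10075.
print(updates_to_exhaust(10075, 7, 0.1))    # 91     (reuse=10,   epsilon=0.1)
print(updates_to_exhaust(10075, 7, 0.001))  # 14173  (reuse=1000, epsilon=0.001)
# Mixing the two (epsilon-0.1 threshold, epsilon-0.001 decay) gives the 9,570 figure:
print(math.ceil(math.log((0.1 * 7) / 10075) / math.log(1 - 0.001)))  # 9570
```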

That's the first part of the blowout - roughly 100x-150x more hits required to exhaust the edge. The second part - which seems to be very much by design; just how it interacts with exhaustion is the HMAS That's Very Annoying - is that as the reuse parameter increases, edges are reused less.

Over ye olde 4 sector Zhodani heartland test: reuse=5 has its first edge (on any route) exhaust after 291 routes, while reuse=1000 has its first edge exhaust after 30,346 routes - again roughly 100x more routes to the first exhausted edge (that testbed having 108,247 routes in total).