Zannick commented 1 month ago

Here's a segment of the current best route in AV2:

* Collect The_Student from Irikar > Basement Pipes > High Pipe > Tablet
  Move... to Irikar > Beach > Cache
* Collect Power_Matrix from Irikar > Beach > Cache > Item
  MainSavewarp to Ebih > Ebih West > Lower Save
  FastTravelKiengirwarp to Menu > Kiengir Map > Ebih West Lower
  Move... to Amagi > West Lake > Stronghold Front Room
! Do Become Indra
  Move... to Amagi > West Lake > Stronghold Ceiling Left
* Collect Amagi_Stronghold_Boulder_1 from Amagi > West Lake > Stronghold Ceiling Left > Knock Down Left Boulder
  Move... to Amagi > West Lake > Stronghold Middle Column
  FastTravelKiengirwarp to Menu > Kiengir Map > Shockwave
  Move... to Irikar > Basement Pipes > Left Vertical Pipe
* Collect Power_Core from Irikar > Basement Pipes > Left Vertical Pipe > Health Pickup
! Do Become Drone
  FastTravelKiengirwarp to Menu > Kiengir Map > Irikar Basement Core

Some issues present in this route:

Items we don't need: the Amagi boulder, and the Power Core.
Warping away to get another item then warping back to collect an item that was nearby the first part of the route.

Observations as we use them now aren't really helping here, because these aren't exactly hitting the same spots along this segment. So there's no loop we can cut that avoids picking up the useless Power Core, for example.

In part we've been okay with such a route because we expect the search to pick up and make improvements. For example, we could skip the Power Core entirely and become drone, which should match observations. But that has a time estimate of 2972781 and progress level 108... after 250M additional states processed, the min value in the queue for that level was 2522438, with the db slightly lower. (And the earliest I have left in scrollback is 18M states prior, with a 2535147, so the min time actually went down!) Some of that could be from earlier inefficiencies being compounded into the time, but this illustrates a heavy drawback of our time estimate and queuing setup and how it's not very close to finding real improvements. This is sort of the point of #164.

If we want to mutate the route directly (#163), previously we skipped locations we didn't need and recreated the solution. This would still need a way to drop extra movement and/or actions in between so we don't wind up going halfway across the map to perform "Become Drone".

Some ideas for how to improve with observations:

Allow starting somewhere other than the same spot. That is, we could match observations with a state at spot A and the first step is "go to spot B", in which case we simply find the shortest route there. This is crucial for a lot of these other ideas as well.
Group the root level of observations at the Area level rather than Spot.
Reduce the values in the trie from a full list of the history to a list of the Locations (and maybe actions).
Restrict the observations to just items. This might be riskier for not having energy, other context variables. But the idea is "if we currently have these items, then we could try getting the rest in this order".
Find some way we could store smaller sections of routes, e.g. consecutive steps that stay in-region and collect multiple items?

Zannick commented 1 month ago

One further point on the progress level queue: part of the issues with estimates as they are is that they tend low as a matter of not being perfect, and we combine different sets of item collections into the same progress levels. As long as we visited 108 locations, we slot the state into level 108, and if we collected 108 in a more efficient way than actual solutions, then the remaining time estimate is more likely further away from actual, and thus we're more likely to prioritize states that have made worse long-term choices.

It would be very difficult to replace the whole progress-based bucket queue with one based on sets of locations visited, in part because the number of possibilities is exponential. Instead:

There might be a way to stratify progress scores further, essentially grading a state on more than just the number of required locations that it's visited. Likely this would have to be weighted based on locations, rather than taking history into account.
We track sets of items (i.e. observations) in a matcher trie that we can use to find more efficient route segments to get other items or sets of items (or all of the items, i.e. route to the finish).
We have some smaller cache containing the best unprocessed state for each combination that we can have a dedicated thread or two processing (with some prioritization metric that we can use to evict entries into the queue db if it gets too large).

Zannick commented 3 weeks ago

164 seems to have greatly improved this issue. I'm not entirely sure it's the best answer, but I'm happy with it for now.

Zannick / logic-graph

Route improvements needed #169

164 seems to have greatly improved this issue. I'm not entirely sure it's the best answer, but I'm happy with it for now.