Open Zannick opened 1 month ago
One further point on the progress level queue: part of the problem with the estimates as they are is that they tend low (because they're imperfect), and we combine different sets of collected items into the same progress levels. As long as a state has visited 108 locations, we slot it into level 108. If it collected those 108 more efficiently than actual solutions do, then its remaining time estimate is likely further from the true value, and thus we're more likely to prioritize states that have made worse long-term choices.
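To make the collision concrete, here is a minimal sketch of a bucket queue keyed only on the number of visited locations. It is not the project's actual queue: `ProgressBucketQueue`, its methods, the location names, and all the numbers are made up for illustration. It only shows that two states with completely different visited sets compete in the same bucket, and that the one with the lower (possibly too optimistic) score wins.

```python
import heapq
from collections import defaultdict


class ProgressBucketQueue:
    """Hypothetical stand-in: one min-heap per progress level."""

    def __init__(self):
        self.buckets = defaultdict(list)

    def push(self, visited, elapsed, estimate_remaining):
        # The level depends only on HOW MANY locations were visited,
        # not on WHICH ones.
        level = len(visited)
        score = elapsed + estimate_remaining
        heapq.heappush(self.buckets[level], (score, tuple(sorted(visited))))
        return level

    def min_score(self, level):
        bucket = self.buckets[level]
        return bucket[0][0] if bucket else None

    def pop(self, level):
        # Returns (score, visited) for the lowest-scored state in this level.
        return heapq.heappop(self.buckets[level])


queue = ProgressBucketQueue()
# State A: three cheap pickups, with an estimate that is too optimistic.
level_a = queue.push({"a1", "a2", "a3"}, elapsed=100, estimate_remaining=500)
# State B: three pickups that real solutions use, slightly slower so far.
level_b = queue.push({"b1", "b2", "b3"}, elapsed=150, estimate_remaining=520)
```

Both states land in level 3, and State A is popped first even if its set of pickups is the worse long-term choice, since nothing in the bucket key distinguishes the two sets.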
It would be very difficult to replace the whole progress-based bucket queue with one based on sets of locations visited, in part because the number of possibilities is exponential. Instead:
Here's a segment of the current best route in AV2:
Some issues present in this route:
Observations as we use them now aren't really helping here, because these aren't exactly hitting the same spots along this segment. So there's no loop we can cut that avoids picking up the useless Power Core, for example.
In part we've been okay with such a route because we expect the search to pick it up and make improvements. For example, we could skip the Power Core entirely and become drone, which should match observations. But that state has a time estimate of 2972781 at progress level 108, and after 250M additional states processed, the min value in the queue for that level was 2522438, with the db slightly lower. (The earliest I have left in scrollback is 18M states prior, with a 2535147, so the min time actually went down!) Some of that could be from earlier inefficiencies being compounded into the time, but this illustrates a heavy drawback of our time estimate and queuing setup: the search is not very close to finding real improvements. This is sort of the point of #164.
If we want to mutate the route directly (#163): previously we did this by skipping locations we didn't need and recreating the solution. This would still need a way to drop extra movement and/or actions in between so we don't wind up going halfway across the map to perform "Become Drone".
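As a rough sketch of what "dropping extra movement" could look like, assume a route is a flat list of steps tagged as `move`, `visit`, or `action`. The `Step` type, `drop_visits`, and the room and item names below are hypothetical, not the project's real route representation. When a visit is skipped, the run of moves that led up to it is removed as well:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    kind: str  # "move", "visit", or "action" (assumed step kinds)
    name: str


def drop_visits(route, unneeded):
    """Remove visits to `unneeded` locations and the moves leading to them."""
    kept = []
    for step in route:
        if step.kind == "visit" and step.name in unneeded:
            # Strip the trailing moves that only existed to reach this visit.
            while kept and kept[-1].kind == "move":
                kept.pop()
            continue
        kept.append(step)
    return kept


route = [
    Step("move", "room 1"),
    Step("visit", "Flask"),
    Step("move", "room 2"),
    Step("move", "room 3"),
    Step("visit", "Power Core"),
    Step("action", "Become Drone"),
]
trimmed = drop_visits(route, {"Power Core"})
```

Here the action ends up directly after the last kept visit instead of after the detour. A real version would still have to re-path between the surviving steps and check that each one remains reachable and legal from the new position.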
Some ideas for how to improve with observations: