States "near" a solution route are often not being processed

Zannick commented 1 month ago

When we recreate a route, since we need the "next" states to be complete, we insert into the queue/state dbs every state that follows each step in the route. We should find these states again, as they might be better than other states that we are preferring despite having worse time estimates.

It might be too tricky to figure out "distance" from the nearest state in a solution, so perhaps the right method is to take the states in a solution and advance them.

Zannick commented 1 month ago

I started thinking of designs for partitioning a queue by set of visits. It's relatively trivial to take the visit bits and turn them into a Vec of bytes which can be used as a unique key, but the main question of the design is how do we organize a queue?

Right now, our queue is a bucket queue where each bucket is a double priority queue, which allows for independent pop and evict per bucket. It might continue to be reasonable to keep visits grouped by number but allow dividing up the buckets further by unique key. So then each bucket in the top-level bucket queue would then be a KeyedBucketQueue that we'd have to design, and each bucket in that queue would be a double priority queue so that we can evict from them. The question is what else do we need in order to accomplish the fairness we're looking for, and importantly, do we need to update the db schema to accommodate?

Or perhaps we make a separate queue just for improving solution routes. Or perhaps we add a prioritization penalty when we start having a lot of entries for a given unique key.

Requirements

Access all buckets without having the keys.
A general thread should access all buckets basically evenly.
When the queue is too large, we can evict from buckets.
When the number of buckets is too large, we can evict entire arbitrary buckets.
Eviction prefers states further away from their initiator (i.e. the states that have gone much further without visiting a new location).

Nice-to-haves

States can be retrieved from the db in a similar way (there must be a way to determine we'd like to try them again).

Zannick commented 1 month ago

"Eviction prefers states further away from their initiator" is an interesting idea on its own--we could change the number used in the priority queues (the score) to be independent of the number of visits, since truly it already is.

Using the time since last visit would equalize states with the same number of visits, and still allow prioritizing better routes to the next location. If we score based on that plus estimated time left, it will likely prioritize more roundabout routes that did eventually make a visit farther away (and interestingly, it will result in the higher progress levels having lower scores). Maybe we'd want to use a tuple score to include a tiebreaker, such as (time since last visit, total estimated time).

Either way, changing the score wouldn't require as much work, so I'll focus on trying this first. We will need to add a field for time since last visit to ContextWrapper and statedb, make sure the score field in the queuedb key is set correctly, and make sure the cap is applied correctly to total estimated time and not score.

Zannick / logic-graph

States "near" a solution route are often not being processed #164