Closed Zannick closed 1 week ago
I started thinking of designs for partitioning a queue by set of visits. It's relatively trivial to take the visit bits and turn them into a Vec of bytes which can be used as a unique key, but the main question of the design is how do we organize a queue?
Right now, our queue is a bucket queue where each bucket is a double priority queue, which allows for independent pop and evict per bucket. It might continue to be reasonable to keep visits grouped by number but allow dividing up the buckets further by unique key. So then each bucket in the top-level bucket queue would then be a KeyedBucketQueue that we'd have to design, and each bucket in that queue would be a double priority queue so that we can evict from them. The question is what else do we need in order to accomplish the fairness we're looking for, and importantly, do we need to update the db schema to accommodate?
Or perhaps we make a separate queue just for improving solution routes. Or perhaps we add a prioritization penalty when we start having a lot of entries for a given unique key.
Requirements
Nice-to-haves
"Eviction prefers states further away from their initiator" is an interesting idea on its own--we could change the number used in the priority queues (the score) to be independent of the number of visits, since truly it already is.
Using the time since last visit would equalize states with the same number of visits, and still allow prioritizing better routes to the next location. If we score based on that plus estimated time left, it will likely prioritize more roundabout routes that did eventually make a visit farther away (and interestingly, it will result in the higher progress levels having lower scores). Maybe we'd want to use a tuple score to include a tiebreaker, such as (time since last visit, total estimated time).
Either way, changing the score wouldn't require as much work, so I'll focus on trying this first. We will need to add a field for time since last visit to ContextWrapper and statedb, make sure the score field in the queuedb key is set correctly, and make sure the cap is applied correctly to total estimated time and not score.
When we recreate a route, since we need the "next" states to be complete, we insert into the queue/state dbs every state that follows each step in the route. We should find these states again, as they might be better than other states that we are preferring despite having worse time estimates.
It might be too tricky to figure out "distance" from the nearest state in a solution, so perhaps the right method is to take the states in a solution and advance them.