Closed jonhoo closed 4 years ago
Erasing an element doesn't actually invalidate a RawIter
as long as the element you removes is "behind" the iterator. This should probably be documented better, but if you look at retain
you can see that is how it works.
Ah, I think maybe some context is needed here.
The iterator griddle keeps is an iterator over the "old" table of elements that have not yet been moved. We keep that iterator around even while other operations are performed (like removals) so that the next insert can immediately get the next item to move without having to scan for non-empty buckets. There are no elements left "before" the iterator, as they have all been moved.
The issue here is that the user calls remove
on the top-level map, which then ends up removing an upcoming element from the cached iterator (potentially even in its current group). At the very least, the items
counter in the RawIter
needs to be decremented, but I also experienced that the cached iterator would yield empty buckets for the items that have been removed since when it was created. Basically, this code:
https://github.com/jonhoo/griddle/blob/91b0f5b1b08c87834d7e36b3aa8230a46fbeaddc/src/lib.rs#L783-L787
Triggered this assertion: https://github.com/rust-lang/hashbrown/blob/efbd03661fc9df0e594ecc89784107353cc71060/src/raw/mod.rs#L491
I think a "general" refresh is probably too tricky to add (e.g., you wouldn't know how to change items
without re-scanning), but maybe a notify_remove_ahead
method that decrements by 1
and refreshes the current iterator group?
When a remove happens during the resize period, we need to restart the
RawIter
we have over the old table to track which elements have yet to be removed. This restart happens here: https://github.com/jonhoo/griddle/blob/b2a063c9efb82907447d92dad88015314379402c/src/lib.rs#L865-L871Unfortunately, when the iterator is restarted this way, it has to walk all the empty buckets of elements we have already moved the next time we try to move a key. When the old map is large, this can take a significant amount of time.
It would be better if we could "refresh" the iterator instead. This should be possible, since we know the old table has neither shrunk nor grown. I probably requires some changes internally in
hashbrown
. I pinged thehashbrown
author about it in https://github.com/rust-lang/hashbrown/pull/166#issuecomment-651446241.