Closed novacrazy closed 5 months ago
Yes, that kind of API would be helpful. However, I wouldn't expect a huge performance gain, since there can always be other threads inserting values in that range, which prevents me from implementing something similar to `split_off`.
-> Still, I can get rid of one tree traversal for each removal (compared to `for e in range { if cond { tree.remove(e.0); } }`), so the gain would be around ~30%, I'd say.
I'll implement something next week.
For my use case at least, the key is always atomically incrementing; I keep an `Arc<AtomicU64>` around for that, so it would never re-insert below the range. Perhaps some sort of unsafe `fn split_off` or `drain` that assumes this? If not, I understand; this is a hefty ask.
Thank you again for your continued development and upkeep of `scc`. It's become a staple library for me.
In my opinion, implementing a new data structure for this purpose is optimal, since the ever-increasing-key assumption will enable a lot of optimization; I think a ring-buffer-like data structure would be better than a tree (I haven't thought about it seriously yet).
On the other hand, implementing `fn split_off` seems to be a viable option, and maybe it doesn't have to be `unsafe`, because `fn insert` can check whether the inserted key has just been split off; though correctly implementing it will be super hard.
Anyway, I'll first focus on optimizing the `for e in range { if cond { tree.remove(e.0); } }` case, and then think about other possible long-term solutions (including a new data structure and `split_off`) afterwards.
`fn remove` for `Iter` and `Range`.
=> It's relatively trivial; ~ 14 Jan.

`fn remove_range<R: RangeBounds<K>>(&self, range: R)` for bulk removal.
=> Slightly complicated, but it will be doable! Though I need to figure out the possible outcomes when there are other threads inserting keys in the range, and document them.
=> ~ mid Feb?

OK, I tried to implement `remove` for `Iter` and `Range`, but it turned out to be much more complicated than expected.
On the other hand, O(1) bulk removal seems to be quite doable in a couple of weeks.
So, in this issue, as originally suggested, I'll directly implement `remove_range`.
O(1) bulk removal would be more than optimal. Also absolutely no rush on this, and thank you again!
You're welcome! Just note that O(1) is the best-case scenario; most of the time, removing entries close to the start and end bounds will require tree traversal.
FYI: uploaded scc 2.0.9, which contains an unoptimized version of `remove_range`. To be optimized later.
FYI: uploaded 2.0.14, which contains the proposed optimized version of `remove_range`.
-> Time complexity is O(depth = logN), independent of the size of the range.
-> Not 100% sure about the correctness of the method. If the method doesn't seem to function correctly, please use `for (k, _) in tree.range(a..b, guard) { tree.remove(k); }` instead.
Would it be possible to add an efficient `remove_range` or `drain`-like API on `TreeIndex`?
I'd like to use a `TreeIndex<u64, Arc<[u8]>>` as a queue where many readers keep a pointer into it by index, so they can keep up at their own pace, and periodically clear old entries like `.remove_range(..least_recent_index)`. The alternative would be just repeated calls to `.remove` with a decrementing counter until it returns false.