orium / rpds

Rust persistent data structures
Mozilla Public License 2.0
1.22k stars 57 forks source link

Add RedBlackTree::range #39

Closed jneem closed 5 years ago

jneem commented 5 years ago

This implementation is based on the suggestion in #23, but I ended up diverging a bit from that. In particular, I'm not implementing IterArc in terms of RangeIter, because the termination logic ended up being different and so there wasn't much code-sharing to be had. Also, the logic in RangeIter is much nicer if I don't include the lazy-initialization optimization for the stacks.

I did, however, attempt to reuse some code between IterArc and RangeIter by factoring out some utilities for managing the stacks.

orium commented 5 years ago

Hi @jneem,

thanks for the PR. I didn't do a full review but from a quick look: I would prefer to have an range iterator over Arc, i.e. following the exact same pattern as for the normal iterator. The reason for this it that at some point we might want to expose this to the end user, and it can be also helpful to implement some internal operations.

I'm not implementing IterArc in terms of RangeIter, because the termination logic ended up being different and so there wasn't much code-sharing to be had.

Even if we don't cut down a lot of code I think it is worth doing. It will make IterArc almost trivially correct since it could be defined like:

#[derive(Debug)]
pub struct IterArc<'a, K: 'a, V: 'a> {
    range_iter: IterArcRange<'a, K, V>,

    /// Number of elements left in the iterator.
    size:  usize,
}

and pretty much all the code it needs is to update size (needed for the size_hint()).

jneem commented 5 years ago

Thanks for the comments. Indeed, it seems nicer this way. The only thing is that you lose the lazy initialization of the two stacks.

jneem commented 5 years ago

Thanks for the comments! I think I can easily fix everything except for this one:

I have a bunch of suggestions, but in terms of behavior the only think that should change is making it lazily initialized.

There are various problems with lazy initialization of range() (which is why I originally didn't implement iter() in terms of range(), in order to preserve lazy initialization for iter()):

orium commented 5 years ago

Alright. I will take a look at how to do that in a nice way in the future, but for now lets drop the lazy initialization.

In that case update the complexity table here: https://github.com/orium/rpds/blob/633774d6b9cd62b9f3997452815f8a7f79376318/src/set/red_black_tree_set/mod.rs#L61

orium commented 5 years ago

Awesome work. Thanks @jneem.