Closed apasel422 closed 9 years ago
The doc comments for the new structs also need to be updated.
Still reading, might find answers eventually, but just asking this now:
Do we still need the warts of RevIter and lower/upper_bound now that we have this nice generic tooling?
Have you benched other search time changes after this refactoring?
Sort of. Ultimately, it would be nice to do this, per your Get
trait proposal:
let map: TreeMap<u32, char> = ...
let value: Option<&char> = map.get(&1);
let iter_from: SomeIter = map.get(1..);
let iter_to: SomeIter = map.get(..1);
let iter_range: SomeIter = map.get(1..3);
let iter_all: Iter = map.get(..);
with mutable variants. But that won't provide distinction between open and closed bounds (the {lower,upper}_bound
methods should be renamed anyway) until we get such a range type in std
, or the RevIter
issue, because it is not trivial to implement DoubleEndedIterator
, at least not without parent pointers. I'm not familiar with existing algorithms for it, but the naive (only?) approach would be to maintain two stacks of nodes.
As for benching, I actually believe that the benches for searching are currently useless due to the lack of black_box
. I'll investigate and post some numbers.
Note that returning from the bench closure feeds that value into a black_box (dunno what the benches look like today).
Another proposal I've been rattling around in my head has been:
fn pred(&self, Bound<&K>) -> Option<&V>
fn succ(&self, Bound<&K>) -> Option<&V>
(plus mutable variants)
where Inclusive and Exclusive are as expected, and pred(Unbounded) == max
and succ(Unbounded) == min
. A bit of a hack to reduce the combinatoric explosion in the api of wanting to support find_(pred|succ)_(inclusive|exclusive)_(mut?) + (min|max)_(mut?)
. Although in reality the combinatorics are there, they've just been squished a bit to passing in a Bound instead of calling a particular method. Maybe that makes things a bit too messy, though (and the min/max hack is maybe more cute than useful).
In theory you want these methods in addition to the iterator-y ones because the iterators need to do more work storing search paths and whatnot.
Yeah, unfortunately the current benches have a trailing semi-colon, so I think they're feeding ()
.
As for your proposal, that's certainly a possibility. I will explore a bit more. I think we can leverage traits and type-level bounds (struct Closed<T>(T); struct Open<T>(T);
) to reduce the surface area.
Another combinatoric explosion is waiting to happen via the Entry
API, which hasn't been implemented for TreeMap
yet (but which is on my to-do list).
For the benches, it would be great if you could bring in what you did for rust-lang/rust#22151.
I have an alternative trait-based design where you can do:
let map: TreeMap<u32, &str> = ...;
map.get(Min);
map.get(Lt(&1));
map.get(Le(&1));
map.get(&1);
map.get(Ge(&1));
map.get(Gt(&1));
map.get(Max);
with mutable variants. The struct names can be changed to better indicate what they do. I have to investigate how feasible this is for the iterator and entry methods (i.e. without code duplication), but it would certainly reduce the number of methods.
Oh that's a really nice design!
I'm still working on the design, so I'd hold off on continuing the review for now.
Benches before 94ee0ae:
tree::map::bench::find_rand_100 ... 7 ns/iter (+/- 1)
tree::map::bench::find_rand_10_000 ... 76 ns/iter (+/- 14)
tree::map::bench::find_seq_100 ... 7 ns/iter (+/- 4)
tree::map::bench::find_seq_10_000 ... 39 ns/iter (+/- 3)
Benches after 94ee0ae:
tree::map::bench::find_rand_100 ... 7 ns/iter (+/- 1)
tree::map::bench::find_rand_10_000 ... 74 ns/iter (+/- 6)
tree::map::bench::find_seq_100 ... 7 ns/iter (+/- 1)
tree::map::bench::find_seq_10_000 ... 40 ns/iter (+/- 4)
One downside of this is that it requires importing Max
/Min
, and later it will require importing the predecessor/successor queries. Inherent methods do have an advantage there.
Yeah. Also we're not really getting rid of combinatorics, just adding a weird calling convention. Internally all the combinatorics are still there, right?
Yes, though this is moving toward less redundancy in the implementation too. In order to reduce it further, I'd like to be able to just transmute
from immutable references to mutable ones when we know it's safe to do so. My goal with the design, though it'll show more once the Entry
and removal operations are supported, is to be able to create an Entry
(or perform removal) without the queries themselves knowing how to do so. (A wrapper function builds a stack of nodes as the query itself progresses down the tree.) This is very simple to achieve with equality and min/max, but harder to do automatically with predecessor/successor.
Externally, we're either going to have:
map.get(&1 / Succ(&1) / Max);
map.get_mut(&1 / Succ(&1) / Max);
map.remove(&1 / Succ(&1) / Max);
or
map.get(&1);
map.get_mut(&1);
map.remove(&1);
map.succ(&1);
map.succ_mut(&1);
map.remove_succ(&1);
map.max();
map.max_mut();
map.remove_max();
To say nothing of the Entry
API, Pred
, inclusive versions of Pred
and Succ
(which could be represented as separate types, separate methods, or a bool
parameter), Min
, iterators (which come in reverse and mutable varieties), or the methods that return both the key and value (the latter of which can be mutable).
Of course, {get, get_mut, remove}
could all be replaced with just the Entry
API (using either of the above conventions), but that would be unergonomic and inefficient.
Java's NavigableMap
interface is a good example of the combinatoric issue.
Closing in favor of a more principled future approach.
A lot of the redundant impls could be removed if we used type aliases instead of wrapper types.
Benches: