initial implementation of rw_array based on const generics

geeklint commented 2 years ago

Here's another idea I had. There are the existing functions on cell owners rw, rw2, and rw3, which can borrow multiple cells at a time, but that method is limited by how many arity variants this crate wants to implement. With const generics, we can provide a version of the rw function which supports many more simultaneous borrows.

This is limited to only an array of cells with the same generic type, although in practice that is a very common use case.

I added this behind a const_generics feature flag, so that it could be added without bumping MSRV. By putting the methods in a separate module, an older compiler can still compile it as long as that feature flag disabled. It has the unfortunate side effect of requiring the inner UnsafeCell in the cell types to become pub(crate) though.

The main limiting factor in this implementation is the function which checks for duplicates (for safety). I benchmarked a couple implementations, and settled on two, depending on the size of the array, the comments in the code should explain them. I'm attaching the benchmark file I used and the output graph from my machine. I'm using the sealed trait pattern to limit it to 255 simultaneous cells, as that limit provides the easiest ways to implement the uniqueness check. You get a compile-time error if you try to use it with a longer array, which should look something like

the trait `LessThan256` is not implemented for `ValueOf<300_usize>`

which seems like a decent self-explanatory error to me.

Also submitting this one as a draft PR, without a test suite.

Benchmark results:

bench-plot main.rs.txt

uazu commented 2 years ago

Just because we can do something doesn't mean we should. Can I back up a bit and ask about motivation?

Do you have a motivating example of where this is needed? When I looked at rw2 and rw3 I looked around my code and tried to think of examples where I'd need to access to more than one item. Actually I had cases where I needed to access 2 at the same time, e.g. swapping data around efficiently. I could just about imagine needing 3, but really I couldn't imagine needing 4.

The other thing is that I think you're creating ~256 types. Is this going to blow up time or space resource requirements for the crate?

geeklint commented 2 years ago

Just because we can do something doesn't mean we should.

Totally fair, probably it's a good idea to table this one for now until someone actually asks for a >3 arity method. Although, if someone does come along, I think we should consider this general solution over adding a rw4+.

The other thing is that I think you're creating ~256 types. Is this going to blow up time or space resource requirements for the crate?

The types are empty and only used for a where clause, I don't believe they will have any impact at runtime (even for debug). It could have an impact on compile time (my testing says it's about a 100ms bump on my machine); but then, only if the user enabled the feature flag.

uazu commented 2 years ago

Okay, thanks. I think I'd need to see an example even if someone claimed they needed 4 or more, because there may be another way that is more efficient. For example if not all combinations of that set of values are needed, then just doing a series of rw2 operations might work out as less comparisons and faster. But certainly it would be worth coming back to this patch if there is a genuine requirement. I've written up issue #27 to see if anyone passes by who has such a requirement.

tuguzT commented 2 years ago

I think this implementation could be extended to work not only for arrays, but for vectors with unknown at compile time length.

geeklint commented 2 years ago

I've thought about that, and actually do have a use case where I could possibly use such a thing, however there are additional complexities. If we're just talking about a &[QCell<T>], it's pretty simple (different slots obviously are different cells), however at least in my use case you're rarely looking at that (since, in the cases you might encounter that, it would be better to use &QCell<[T]>). Instead, I have something like &[Rc<QCell<T>>] and while I "know" they're unique, there's no way to guarantee that at compile time.

To provide a safe interface there, qcell would have to do some kind of uniqueness tracking (e.g. keeping a temporary HashSet around while iterating to make sure we never yield the same pointer twice). HashSets aren't free (from runtime performance costs) and have the caveat that they're presently only available in std (no_std can depend directly on hashbrown, but then that's an additional dependency).

tuguzT commented 2 years ago

In my use case, I use HashMap with TypeId keys and I want to get multiple mutable references on several values by unique keys. I don't know if this is possible using &QCell<[T]> (without collecting values into a vector), but it could be with &[&QCell<T>].

geeklint commented 2 years ago

Right, and that requires the same type of runtime uniqueness checking. It does seem like multiple unique borrows from a hashmap is a bit of an API hole that qcell might be positioned to solve, although I'm curious what precise benefit this hypothetical qcell API has over HashMap::iter_mut.

uazu commented 2 years ago

For uniqueness checking:

Obvious implementation is O(n^2), i.e. check each address against all following ones
Maybe could be implemented as a sort, then check for dups, i.e. O(N.logN), which could be in-place

There is no way to avoid this overhead safely unless some static guarantees can be made, which in general can't. Regarding the issue, I think breaking things down, i.e. putting the cells at a finer level, and doing cell-access at a finer level may get around most cases. But really if there is some case that definitely needs code to handle 3+ simultaneous borrow, I'm still willing to be persuaded.

uazu / qcell

initial implementation of rw_array based on const generics #26