Open Mark-Simulacrum opened 1 year ago
Is there a reason that core::slice::GetManyMutError
isn't re-exported as std::slice::GetManyMutError
?
Since N
is a compile-time constant anyway I would suggest adding a check in get_many_check_valid
that switches to sort_unstable
for, say, N > 10. This code gets optimized out entirely for N <= 10
, and vice versa.
fn get_many_check_valid<const N: usize>(indices: &[usize; N], len: usize) -> bool {
let mut valid = true;
if N <= 10 {
// NB: The optimzer should inline the loops into a sequence
// of instructions without additional branching.
for (i, &idx) in indices.iter().enumerate() {
valid &= idx < len;
for &idx2 in &indices[..i] {
valid &= idx != idx2;
}
}
} else {
let mut sorted = *indices;
sorted.sort_unstable();
for i in 1..N {
valid &= sorted[i-1] < sorted[i];
}
valid &= sorted[N-1] < len;
}
valid
}
On another note, perhaps we should have a DisjointIndices<const N: usize>
type where DisjointIndices::new
calls get_many_check_valid
. Then indexing with DisjointIndices
does not need to repeat the check, allowing you to amortize the cost of checking disjointness over many operations without needing unsafe
code. This type could also have a DisjointIndices::new_sorted
constructor that allows the user to make the disjoint check faster if they know their input indices are sorted.
I also like the idea of DisjointedIndices for this. And since a pair of indices is likely a common use case for this, perhaps something that could be available would be something like disjoint_pair(lhs: usize, rhs: usize) -> DisjointIndices<2> (panics if lhs == rhs).
I've tried to use this a few times now. The fact that the current placeholder error type doesn't distinguish out-of-bounds vs duplicate indices makes it hard to even test in something resembling a real use case. For example, if you wanted to implement slice::swap via slice::get_many_mut (not saying you should), you'd want the duplicate index case to be a nop, but you'd still want the out-of-bounds case to panic. A lot of my hypothetical use cases for get_many_mut have this "multi-object transaction" flavor to them.
Related: Would it make sense to have an implementation of IndexMut for [usize; N] that panics in the duplicates case? The duplicate indices issue seems like it falls into the same bucket of dynamic borrow conflicts you have with RefCell::borrow/borrow_mut (which panics unlike try_borrow/try_borrow_mut), so it seems like a reasonable behavior when you have an upstream invariant that is expected to guarantee uniqueness but you still need the dynamic check for safety.
Splitting the duplicate checking via a DisjointIndices type seems to help with some of these issues as well. In the case of get_many_mut, it means you only have one type of error (out of bounds) as with the plain get method. And if we eventually end up with all these related APIs (e.g. Index and IndexMut can be implemented for DisjointIndices), the stage separation of disjoint index handling helps to eliminate redundancy in both the implementation and interfaces (e.g. error types). And in the aforementioned case where you have upstream invariants that guarantee uniqueness, it means you can propagate that invariant using a DisjointIndices value as a proof witness. (The proof witness idea also works for bounds checking. It doesn't make much sense for a singular index with a known bound since that known bound must still be checked against the slice length, but once you have multiple indices packaged together with a known bound it means you can get away with only one bounds check against the slice length instead of one for each index.)
For embedded use cases, I'd strongly prefer we limit the number of panics added by this feature, limiting it only to out-of-bounds as it is today.
Would it make sense to have an implementation of IndexMut for [usize; N] that panics in the duplicates case?
There's not really a way to implement IndexMut
for multiple indices - what would be the Output
type? If the Output
type is [&mut T]
then you would need some place to store it for index_mut
to return &mut Self::Output
.
The same applies to .get_mut
- it would be awesome to write let [e0, e10] = foo.get_mut(DisjointIndices::new([1,2]).unwrap())
but there's no way to do it. If we'd had GATs from the beginning then Index
could've been type Output<'a>;
and indexing would be far more powerful.
There is also this crate that introduces some macros and methods for retrieving multiple mutable indices from a mutable slice https://crates.io/crates/indices . It uses insertion sort so more optimized towards a small number of indices and near sorted indices. It also has macros that assume the input is already sorted, requiring no sorting at all.
If there isn't any specific reason not to do this, I think we should change to an interface based on SliceIndex
. This is consistent with get{,_mut}
and should make ranges work. It should also be more compatible with using smaller integer types to index into slices, if we ever get that to work.
fn get_many_mut<I, const N: usize>(
&mut self,
indices: [I; N],
) -> Result<[&mut <I as SliceIndex<[T]>>::Output; N], GetManyMutError<N>>
where
// If we switch to using `DisjointIndices` as suggested above, `Ord` is not
// needed here.
I: SliceIndex<[T]> + Ord
{ /* ... */ }
Cc @ChayimFriedman2 since you have a stabilization PR up.
If it's going to go through a trait, should it perhaps be indices: I
?
That would leave space open to also allow things like get_many_mut((..4, 4, 5..))
in future, should we decide to allow it.
That would leave space open to also allow things like
get_many_mut((..4, 4, 5..))
in future, should we decide to allow it.
I make a POC to experiment with this idea.
We discussed this in the libs-api meeting yesterday and concluded that we are mostly happy with the shape that the API has today. We would like it to be extended to support any index that implements SliceIndex
. We are deferring on tuple support for now.
- What are the requirements for the interface? Do we sort the passed array, and then check distinctness, or check in
O(n^2)
time?
Attempting to answer this based on what I half-remember from the meeting:
The 99% case here is 2 (or maybe 3) indices. As such, there's should not be a requirement that they're sorted, and there's no need to do optimizations for many-index cases. Someone looking to use a more complicated scenario -- which wants to pre-check disjointness or sortedness of indices, say -- is expected to make their own wrapper around get_many_unchecked_mut
, which of course doesn't do any O(n²) checks.
@Amanieu Is the team interested in adding range support now, or deferring it? Also, it may cause inference problems later if we make it generic.
Yes, we would like to add range support now.
Feature gate:
#![feature(get_many_mut)]
This is a tracking issue for
get_many_mut
andget_many_unchecked_mut
, which provide&mut T
access to multiple distinct slice elements.Public API
Steps / History
Unresolved Questions
get_many_mut([0..3, 4..6])
)?O(n^2)
time?