rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
98.27k stars 12.71k forks source link

Tracking Issue for get_many_mut #104642

Open Mark-Simulacrum opened 1 year ago

Mark-Simulacrum commented 1 year ago

Feature gate: #![feature(get_many_mut)]

This is a tracking issue for get_many_mut and get_many_unchecked_mut, which provide &mut T access to multiple distinct slice elements.

Public API

impl [T] {
    pub unsafe fn get_many_unchecked_mut<const N: usize>(&mut self, indices: [usize; N]) -> [&mut T; N];
    pub fn get_many_mut<const N: usize>(&mut self, indices: [usize; N]) -> Result<[&mut T; N], GetManyMutError<N>>;
}

pub struct GetManyMutError<const N: usize> { /* private */ }

Steps / History

Unresolved Questions

shepmaster commented 1 year ago

Is there a reason that core::slice::GetManyMutError isn't re-exported as std::slice::GetManyMutError?

orlp commented 1 year ago

Since N is a compile-time constant anyway I would suggest adding a check in get_many_check_valid that switches to sort_unstable for, say, N > 10. This code gets optimized out entirely for N <= 10, and vice versa.

fn get_many_check_valid<const N: usize>(indices: &[usize; N], len: usize) -> bool {
    let mut valid = true;
    if N <= 10 {
        // NB: The optimzer should inline the loops into a sequence
        // of instructions without additional branching.
        for (i, &idx) in indices.iter().enumerate() {
            valid &= idx < len;
            for &idx2 in &indices[..i] {
                valid &= idx != idx2;
            }
        }
    } else {
        let mut sorted = *indices;
        sorted.sort_unstable();
        for i in 1..N {
            valid &= sorted[i-1] < sorted[i];
        }
        valid &= sorted[N-1] < len;
    }

    valid
}

On another note, perhaps we should have a DisjointIndices<const N: usize> type where DisjointIndices::new calls get_many_check_valid. Then indexing with DisjointIndices does not need to repeat the check, allowing you to amortize the cost of checking disjointness over many operations without needing unsafe code. This type could also have a DisjointIndices::new_sorted constructor that allows the user to make the disjoint check faster if they know their input indices are sorted.

ajwock commented 1 year ago

I also like the idea of DisjointedIndices for this. And since a pair of indices is likely a common use case for this, perhaps something that could be available would be something like disjoint_pair(lhs: usize, rhs: usize) -> DisjointIndices<2> (panics if lhs == rhs).

pervognsen commented 1 year ago

I've tried to use this a few times now. The fact that the current placeholder error type doesn't distinguish out-of-bounds vs duplicate indices makes it hard to even test in something resembling a real use case. For example, if you wanted to implement slice::swap via slice::get_many_mut (not saying you should), you'd want the duplicate index case to be a nop, but you'd still want the out-of-bounds case to panic. A lot of my hypothetical use cases for get_many_mut have this "multi-object transaction" flavor to them.

Related: Would it make sense to have an implementation of IndexMut for [usize; N] that panics in the duplicates case? The duplicate indices issue seems like it falls into the same bucket of dynamic borrow conflicts you have with RefCell::borrow/borrow_mut (which panics unlike try_borrow/try_borrow_mut), so it seems like a reasonable behavior when you have an upstream invariant that is expected to guarantee uniqueness but you still need the dynamic check for safety.

Splitting the duplicate checking via a DisjointIndices type seems to help with some of these issues as well. In the case of get_many_mut, it means you only have one type of error (out of bounds) as with the plain get method. And if we eventually end up with all these related APIs (e.g. Index and IndexMut can be implemented for DisjointIndices), the stage separation of disjoint index handling helps to eliminate redundancy in both the implementation and interfaces (e.g. error types). And in the aforementioned case where you have upstream invariants that guarantee uniqueness, it means you can propagate that invariant using a DisjointIndices value as a proof witness. (The proof witness idea also works for bounds checking. It doesn't make much sense for a singular index with a known bound since that known bound must still be checked against the slice length, but once you have multiple indices packaged together with a known bound it means you can get away with only one bounds check against the slice length instead of one for each index.)

kupiakos commented 11 months ago

For embedded use cases, I'd strongly prefer we limit the number of panics added by this feature, limiting it only to out-of-bounds as it is today.

Would it make sense to have an implementation of IndexMut for [usize; N] that panics in the duplicates case?

There's not really a way to implement IndexMut for multiple indices - what would be the Output type? If the Output type is [&mut T] then you would need some place to store it for index_mut to return &mut Self::Output.

The same applies to .get_mut - it would be awesome to write let [e0, e10] = foo.get_mut(DisjointIndices::new([1,2]).unwrap()) but there's no way to do it. If we'd had GATs from the beginning then Index could've been type Output<'a>; and indexing would be far more powerful.

mcmah309 commented 6 months ago

There is also this crate that introduces some macros and methods for retrieving multiple mutable indices from a mutable slice https://crates.io/crates/indices . It uses insertion sort so more optimized towards a small number of indices and near sorted indices. It also has macros that assume the input is already sorted, requiring no sorting at all.

tgross35 commented 3 months ago

If there isn't any specific reason not to do this, I think we should change to an interface based on SliceIndex. This is consistent with get{,_mut} and should make ranges work. It should also be more compatible with using smaller integer types to index into slices, if we ever get that to work.

fn get_many_mut<I, const N: usize>(
    &mut self,
    indices: [I; N],
) -> Result<[&mut <I as SliceIndex<[T]>>::Output; N], GetManyMutError<N>>
where
    // If we switch to using `DisjointIndices` as suggested above, `Ord` is not
    // needed here.
    I: SliceIndex<[T]> + Ord
{ /* ... */ }

Cc @ChayimFriedman2 since you have a stabilization PR up.

scottmcm commented 3 months ago

If it's going to go through a trait, should it perhaps be indices: I?

That would leave space open to also allow things like get_many_mut((..4, 4, 5..)) in future, should we decide to allow it.

tguichaoua commented 3 months ago

That would leave space open to also allow things like get_many_mut((..4, 4, 5..)) in future, should we decide to allow it.

I make a POC to experiment with this idea.

Amanieu commented 1 month ago

We discussed this in the libs-api meeting yesterday and concluded that we are mostly happy with the shape that the API has today. We would like it to be extended to support any index that implements SliceIndex. We are deferring on tuple support for now.

scottmcm commented 1 month ago
  • What are the requirements for the interface? Do we sort the passed array, and then check distinctness, or check in O(n^2) time?

Attempting to answer this based on what I half-remember from the meeting:

The 99% case here is 2 (or maybe 3) indices. As such, there's should not be a requirement that they're sorted, and there's no need to do optimizations for many-index cases. Someone looking to use a more complicated scenario -- which wants to pre-check disjointness or sortedness of indices, say -- is expected to make their own wrapper around get_many_unchecked_mut, which of course doesn't do any O(n²) checks.

ChayimFriedman2 commented 1 month ago

@Amanieu Is the team interested in adding range support now, or deferring it? Also, it may cause inference problems later if we make it generic.

Amanieu commented 1 month ago

Yes, we would like to add range support now.