rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
98.38k stars 12.73k forks source link

Tracking Issue for map_many_mut #97601

Open Urgau opened 2 years ago

Urgau commented 2 years ago

Feature gate: #![feature(map_many_mut)]

This is a tracking issue for the HashMap::get_many{,_unchecked}_mut functions.

Attempts to get mutable references to N values in the map at once.

Public API

// in HashMap

pub fn get_many_mut<Q: ?Sized, const N: usize>(
    &mut self,
    ks: [&Q; N]
) -> [Option<&'_ mut V>; N]
where
    K: Borrow<Q>,
    Q: Hash + Eq;

pub unsafe fn get_many_unchecked_mut<Q: ?Sized, const N: usize>(
    &mut self,
    ks: [&Q; N],
) -> [Option<&'_ mut V>; N]
where
    K: Borrow<Q>,
    Q: Hash + Eq;

Steps / History

Unresolved Questions

ibraheemdev commented 2 years ago

Would it be better for this to return Result<[Option<&mut V>; N], DuplicateKeys>?

Urgau commented 2 years ago

Would it be better for this to return Result<[Option<&mut V>; N], DuplicateKeys>?

I don't think so, returning a Result<_, DuplicateKeys> would different from any other get methods on HashMap; they all return Option<_> of something; And returning a individual Option for every entry would completely change the semantic of the function and make things more difficult for peoples who wants all or nothing to achieve it.

Arnavion commented 2 years ago

And returning a individual Option for every entry would completely change the semantic of the function and make things more difficult for peoples who wants all or nothing to achieve it.

Sure, but the current implementation makes things more difficult for people who want to do multiple lookups in parallel and then do some mutations based on which keys exist and don't exist.

Returning [Option<&mut>] would not only provide strictly more information, but can be converted to Option<[&mut]> using result.try_map(std::convert::identity) (as suggested by AstrallyForged on IRC).


If I follow the history correctly, the libstd implementation returns Option<[&mut]> because hashbrown's does. hashbrown's implementation originally returned [Result<&mut>] but this was then changed to Option<[&mut]> to align with libstd's [T]::get_many_mut. So perhaps we need to start there.

workingjubilee commented 2 years ago

Bikesheddy note: this is very similar to the notion of a "gather" (a la Simd::gather_*) and that term is used extensively in similar interfaces (though it is also called something like "indexed vector load" in some cases).

forrestli74 commented 2 years ago

This seems very restrictive on the number of elements at the same time. Would be nice to have a version that supports Vec or Iterator of some sort. We don't have to support the general form now, but should think about how to design the API that adding general form is easy.

mqudsi commented 1 year ago

Are there any performance benefits to this other than eliminating a for loop caller-side? Can someone explain the motivation or a use-case for this, given that the code this replaces is rather short and to-the-point?

smarnach commented 1 year ago

@mqudsi It's currently impossible in safe code to get simultaneous mutable references to different values in a HashMap, and the unsafe code needed to do this in a sound way is not straightforward at all. You can easily get shared references to multiple values, which is why there is only get_many_mut() and no get_many(), but for multiple mutable references the new function is helpful.

Uriopass commented 1 year ago

What about BTreeMap ? Would it be possible to have get_many_mut and get_many_unchecked_mut there too ?

Darksonn commented 1 year ago

and the unsafe code needed to do this in a sound way is not straightforward at all

Can it be done at all?

nagisa commented 1 year ago

Rather than returning references to values, could this be an entry-based API instead? That would forgo the questions about people wanting to potentially make multiple lookups without them necessarily failing if some of them are missing. The current proposed function could easily be built on top of an entry-based API, but vice-versa is not possible.

Arnavion commented 1 year ago

Does it work to have multiple Entrys in flight simultaneously? Changing one of them from Vacant to Occupied might require reallocation which would invalidate the other Occupieds.

JustForFun88 commented 1 year ago

This seems very restrictive on the number of elements at the same time. Would be nice to have a version that supports Vec or Iterator of some sort...

I implemented a similar thing in this pull request: https://github.com/rust-lang/hashbrown/pull/408

ilyvion commented 1 year ago

This seems very restrictive on the number of elements at the same time. Would be nice to have a version that supports Vec or Iterator of some sort...

I implemented a similar thing in this pull request: rust-lang/hashbrown#408

I don't see how that does what was requested. As far as I can tell based on that PR, your changes still require a const N: usize, i.e. it's a decided-at-compile-time number of queries and results, not a decided-at-runtime number of queries and results like Vec or a non-array-based iterator could support.

DrAlta commented 8 months ago

What about BTreeMap ? Would it be possible to have get_many_mut and get_many_unchecked_mut there too ?

I assume they are waiting for this to reach stable for HashMaps before bringing it to BTreeMap.

But if you just want something that works

https://github.com/DrAlta/rust_quality_of_life.git

implements .get_many_mut() for BTreeMap in the GetManyMut trait

Thou it doesn't go poking around the innards of the BTreeMap it just iters over it and returns the wanted bits.

joshtriplett commented 1 month ago

We just had a very long conversation about this in today's @rust-lang/libs-api method.

The outcome of that discussion was that we'd like to change the signatures to:

// in HashMap

/// # Panics
///
/// Panics if the same index was passed more than once.
pub fn get_many_mut<Q: ?Sized, const N: usize>(
    &mut self,
    ks: [&Q; N]
) -> [Option<&'_ mut V>; N]
where
    K: Borrow<Q>,
    Q: Hash + Eq;

/// UB if the same index was passed more than once.
pub unsafe fn get_many_unchecked_mut<Q: ?Sized, const N: usize>(
    &mut self,
    ks: [&Q; N],
) -> [Option<&'_ mut V>; N]
where
    K: Borrow<Q>,
    Q: Hash + Eq;

The two changes here are 1) panicking on duplicate indexes and 2) returning an array of Option rather than an Option of array.

Our rationales for both of these are based on an expected usage model of either having an array of N keys that are statically known at compile time or having an array of most commonly 2 or perhaps 3 keys that are dynamically supplied by the user.

Rationale for 1: In the static case you know there aren't duplicates, and in the dynamic case you have few enough items that you can easily if k1 == k2 (or if src_key == dest_key) and that you probably want to check that case in advance to give a better error and avoid doing the lookups.

Rationale for 2: In the dynamic case, we expect the common case to be 2 keys, and it's extremely easy to let [Some(v1), Some(v2)] = map.get_many_mut([k1, k2]) else { ... }, or use a match if you want to be able to handle those cases. We expect that people will commonly want to report "source doesn't exist" vs "destination doesn't exist", or handle those cases such as by creating the missing value. In the static case, you might want the unchecked variant for performance if you know every item will exist, or you might want to call .expect, but it'd be trivial to provide a helper method .transpose() that turns array-of-Option into Option-of-array (and we think that method should exist; it already does for array-of-MaybeUninit).

clarfonthey commented 1 month ago

Going to work on a PR to update this API to match the recent changes, since it's blocking another PR of mine.

jdahlstrom commented 1 month ago

Just to make sure the consistency issue is noticed: <[T]>::get_many_mut() is currently being stabilized to return Option<[T; N]> Result<[&mut T; N], GetManyMutError<N>> rather than what libs-api requested here.

Urgau commented 3 weeks ago

Now that the unresolved questions have been resolved thanks to https://github.com/rust-lang/rust/issues/97601#issuecomment-2372109119, I would like to propose to stabilize this feature.

Stabilization Report

Implementation History

API Surface

// in HashMap

/// Attempts to get mutable references to `N` values in the map at once.

/// Panics if the same index was passed more than once.
pub fn get_many_mut<Q: ?Sized, const N: usize>(
    &mut self,
    ks: [&Q; N]
) -> [Option<&'_ mut V>; N]
where
    K: Borrow<Q>,
    Q: Hash + Eq + ?Sized;

/// UB if the same index was passed more than once.
pub unsafe fn get_many_unchecked_mut<Q: ?Sized, const N: usize>(
    &mut self,
    ks: [&Q; N],
) -> [Option<&'_ mut V>; N]
where
    K: Borrow<Q>,
    Q: Hash + Eq + ?Sized;

Experience Report

The compiler has been happily using get_many_mut since it's introduction. It permits us to optimize away thousands of lookups in a hot-path.

Example Usages

cc @rust-lang/libs-api

joshtriplett commented 3 weeks ago

@rfcbot merge

rfcbot commented 3 weeks ago

Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.