Open Urgau opened 2 years ago
Would it be better for this to return Result<[Option<&mut V>; N], DuplicateKeys>
?
Would it be better for this to return
Result<[Option<&mut V>; N], DuplicateKeys>
?
I don't think so, returning a Result<_, DuplicateKeys>
would different from any other get
methods on HashMap
; they all return Option<_>
of something; And returning a individual Option
for every entry would completely change the semantic of the function and make things more difficult for peoples who wants all or nothing to achieve it.
And returning a individual Option for every entry would completely change the semantic of the function and make things more difficult for peoples who wants all or nothing to achieve it.
Sure, but the current implementation makes things more difficult for people who want to do multiple lookups in parallel and then do some mutations based on which keys exist and don't exist.
Returning [Option<&mut>]
would not only provide strictly more information, but can be converted to Option<[&mut]>
using result.try_map(std::convert::identity)
(as suggested by AstrallyForged on IRC).
If I follow the history correctly, the libstd implementation returns Option<[&mut]>
because hashbrown's does. hashbrown's implementation originally returned [Result<&mut>]
but this was then changed to Option<[&mut]>
to align with libstd's [T]::get_many_mut
. So perhaps we need to start there.
Bikesheddy note: this is very similar to the notion of a "gather" (a la Simd::gather_*
) and that term is used extensively in similar interfaces (though it is also called something like "indexed vector load" in some cases).
This seems very restrictive on the number of elements at the same time. Would be nice to have a version that supports Vec or Iterator of some sort. We don't have to support the general form now, but should think about how to design the API that adding general form is easy.
Are there any performance benefits to this other than eliminating a for loop caller-side? Can someone explain the motivation or a use-case for this, given that the code this replaces is rather short and to-the-point?
@mqudsi It's currently impossible in safe code to get simultaneous mutable references to different values in a HashMap
, and the unsafe code needed to do this in a sound way is not straightforward at all. You can easily get shared references to multiple values, which is why there is only get_many_mut()
and no get_many()
, but for multiple mutable references the new function is helpful.
What about BTreeMap ? Would it be possible to have get_many_mut
and get_many_unchecked_mut
there too ?
and the unsafe code needed to do this in a sound way is not straightforward at all
Can it be done at all?
Rather than returning references to values, could this be an entry-based API instead? That would forgo the questions about people wanting to potentially make multiple lookups without them necessarily failing if some of them are missing. The current proposed function could easily be built on top of an entry-based API, but vice-versa is not possible.
Does it work to have multiple Entry
s in flight simultaneously? Changing one of them from Vacant
to Occupied
might require reallocation which would invalidate the other Occupied
s.
This seems very restrictive on the number of elements at the same time. Would be nice to have a version that supports Vec or Iterator of some sort...
I implemented a similar thing in this pull request: https://github.com/rust-lang/hashbrown/pull/408
This seems very restrictive on the number of elements at the same time. Would be nice to have a version that supports Vec or Iterator of some sort...
I implemented a similar thing in this pull request: rust-lang/hashbrown#408
I don't see how that does what was requested. As far as I can tell based on that PR, your changes still require a const N: usize
, i.e. it's a decided-at-compile-time number of queries and results, not a decided-at-runtime number of queries and results like Vec
or a non-array-based iterator could support.
What about BTreeMap ? Would it be possible to have
get_many_mut
andget_many_unchecked_mut
there too ?
I assume they are waiting for this to reach stable for HashMaps before bringing it to BTreeMap.
But if you just want something that works
https://github.com/DrAlta/rust_quality_of_life.git
implements .get_many_mut() for BTreeMap in the GetManyMut trait
Thou it doesn't go poking around the innards of the BTreeMap it just iters over it and returns the wanted bits.
We just had a very long conversation about this in today's @rust-lang/libs-api method.
The outcome of that discussion was that we'd like to change the signatures to:
// in HashMap
/// # Panics
///
/// Panics if the same index was passed more than once.
pub fn get_many_mut<Q: ?Sized, const N: usize>(
&mut self,
ks: [&Q; N]
) -> [Option<&'_ mut V>; N]
where
K: Borrow<Q>,
Q: Hash + Eq;
/// UB if the same index was passed more than once.
pub unsafe fn get_many_unchecked_mut<Q: ?Sized, const N: usize>(
&mut self,
ks: [&Q; N],
) -> [Option<&'_ mut V>; N]
where
K: Borrow<Q>,
Q: Hash + Eq;
The two changes here are 1) panicking on duplicate indexes and 2) returning an array of Option rather than an Option of array.
Our rationales for both of these are based on an expected usage model of either having an array of N keys that are statically known at compile time or having an array of most commonly 2 or perhaps 3 keys that are dynamically supplied by the user.
Rationale for 1: In the static case you know there aren't duplicates, and in the dynamic case you have few enough items that you can easily if k1 == k2
(or if src_key == dest_key
) and that you probably want to check that case in advance to give a better error and avoid doing the lookups.
Rationale for 2: In the dynamic case, we expect the common case to be 2 keys, and it's extremely easy to let [Some(v1), Some(v2)] = map.get_many_mut([k1, k2]) else { ... }
, or use a match if you want to be able to handle those cases. We expect that people will commonly want to report "source doesn't exist" vs "destination doesn't exist", or handle those cases such as by creating the missing value. In the static case, you might want the unchecked
variant for performance if you know every item will exist, or you might want to call .expect
, but it'd be trivial to provide a helper method .transpose()
that turns array-of-Option into Option-of-array (and we think that method should exist; it already does for array-of-MaybeUninit).
Going to work on a PR to update this API to match the recent changes, since it's blocking another PR of mine.
Just to make sure the consistency issue is
noticed: <[T]>::get_many_mut()
is currently being stabilized to return Option<[T; N]>
Result<[&mut T; N], GetManyMutError<N>>
rather than what libs-api requested here.
Now that the unresolved questions have been resolved thanks to https://github.com/rust-lang/rust/issues/97601#issuecomment-2372109119, I would like to propose to stabilize this feature.
// in HashMap
/// Attempts to get mutable references to `N` values in the map at once.
/// Panics if the same index was passed more than once.
pub fn get_many_mut<Q: ?Sized, const N: usize>(
&mut self,
ks: [&Q; N]
) -> [Option<&'_ mut V>; N]
where
K: Borrow<Q>,
Q: Hash + Eq + ?Sized;
/// UB if the same index was passed more than once.
pub unsafe fn get_many_unchecked_mut<Q: ?Sized, const N: usize>(
&mut self,
ks: [&Q; N],
) -> [Option<&'_ mut V>; N]
where
K: Borrow<Q>,
Q: Hash + Eq + ?Sized;
The compiler has been happily using get_many_mut
since it's introduction.
It permits us to optimize away thousands of lookups in a hot-path.
cc @rust-lang/libs-api
@rfcbot merge
Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members:
No concerns currently listed.
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!
See this document for info about what commands tagged team members can give me.
Feature gate:
#![feature(map_many_mut)]
This is a tracking issue for the
HashMap::get_many{,_unchecked}_mut
functions.Attempts to get mutable references to
N
values in the map at once.Public API
Steps / History
Unresolved Questions
Should the return type be insteadhttps://github.com/rust-lang/rust/issues/97601#issuecomment-2372109119Result<[Option<&mut V>; N], DuplicateKeys>
? https://github.com/rust-lang/rust/issues/97601#issuecomment-1146872856 and https://github.com/rust-lang/rust/issues/97601#issuecomment-1158233416Having instead a entry-like API? https://github.com/rust-lang/rust/issues/97601#issuecomment-1553012637https://github.com/rust-lang/rust/issues/97601#issuecomment-2372109119