Open clarfonthey opened 2 years ago
I'm a bit concerned about the size and inconsistency of this API, as the corresponding functions for single values are all methods, while for arrays and slices we have associated functions. For example, array_assume_init fundamentally does the same thing as assume_init, but is called in a different way, with a different name. If the API were complete (zeroed_array is currently missing, among others), we would have 8 or so functions for arrays alone (currently only two, but a lot of functionality still unnecessarily requires unsafe).
I would prefer a trait-based API like this:
unsafe trait Uninitialized {
type Initialized: ?Sized;
fn uninit() -> Self
where Self: Sized;
fn zeroed() -> Self
where Self: Sized;
unsafe fn assume_init(self) -> Self::Initialized
where Self: Sized;
unsafe fn assume_init_ref(&self) -> &Self::Initialized;
...
}
unsafe impl<T> Uninitialized for MaybeUninit<T> {...}
unsafe impl<T, const N: usize> Uninitialized for [T; N] {...}
unsafe impl<T> Uninitialized for [T] {...}
or with a marker trait like I proposed here.
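A minimal sketch of how such a trait could look on stable Rust, assuming it is implemented on `MaybeUninit<T>` and `[MaybeUninit<T>; N]`. The trait is hypothetical and not part of std; it is reduced to two methods and to `Sized` types, so the slice case from the proposal is left out. The `squares` helper is only there to exercise it, and the inline `const` block needs Rust 1.79 or later.

```rust
use std::mem::MaybeUninit;

/// Hypothetical trait following the shape proposed above (not a std API).
///
/// # Safety
/// `Self` must have the same layout as `Self::Initialized`.
unsafe trait Uninitialized: Sized {
    type Initialized;

    fn uninit() -> Self;

    /// # Safety
    /// The value must be fully initialized.
    unsafe fn assume_init(self) -> Self::Initialized;
}

unsafe impl<T> Uninitialized for MaybeUninit<T> {
    type Initialized = T;

    fn uninit() -> Self {
        // Resolves to the inherent `MaybeUninit::uninit`, which takes
        // priority over the trait method of the same name.
        MaybeUninit::<T>::uninit()
    }

    unsafe fn assume_init(self) -> T {
        // Likewise the inherent `MaybeUninit::assume_init`.
        unsafe { MaybeUninit::<T>::assume_init(self) }
    }
}

unsafe impl<T, const N: usize> Uninitialized for [MaybeUninit<T>; N] {
    type Initialized = [T; N];

    fn uninit() -> Self {
        [const { MaybeUninit::<T>::uninit() }; N]
    }

    unsafe fn assume_init(self) -> [T; N] {
        // `[MaybeUninit<T>; N]` and `[T; N]` share a layout. The read moves
        // the values out; `self` has no drop glue, so nothing is dropped twice.
        unsafe { (&self as *const Self).cast::<[T; N]>().read() }
    }
}

/// Example use: the same trait methods serve the array type.
fn squares() -> [u32; 4] {
    let mut arr = <[MaybeUninit<u32>; 4] as Uninitialized>::uninit();
    for (i, slot) in arr.iter_mut().enumerate() {
        slot.write((i * i) as u32);
    }
    // SAFETY: every slot was written in the loop above.
    unsafe { Uninitialized::assume_init(arr) }
}
```

The calls use fully qualified syntax so that the sketch does not depend on how method lookup would interact with any inherent methods of the same name.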
I don't think it's worth making a trait for this. I've had many use cases for uninit_array before, and none for a generic solution. MaybeUninit is often used with arrays, since that's where initialization tends to be the most expensive.
While it is a trivial function to write yourself, I do think that it's worth it to stabilize maybe_uninit_uninit_array. It's a very common use case, and the alternative is to write an unsettling MaybeUninit::uninit().assume_init(), which is not very nice and makes it harder to audit the code.
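For reference, a sketch of the "trivial function to write yourself". The helper name `uninit_array` mirrors the unstable method, but this is a local stand-in rather than the std function, and `fill_with_index` is a hypothetical caller.

```rust
use std::mem::{self, MaybeUninit};

/// Local stand-in for the unstable `MaybeUninit::uninit_array`.
fn uninit_array<T, const N: usize>() -> [MaybeUninit<T>; N] {
    // SAFETY: an array of `MaybeUninit<T>` does not require initialization.
    unsafe { MaybeUninit::<[MaybeUninit<T>; N]>::uninit().assume_init() }
}

/// Hypothetical caller: fills a small buffer with its own indices.
fn fill_with_index() -> [u8; 8] {
    // The length is inferred from the type annotation.
    let mut buf: [MaybeUninit<u8>; 8] = uninit_array();
    for (i, slot) in buf.iter_mut().enumerate() {
        slot.write(i as u8);
    }
    // SAFETY: every slot was written above, and the two array types
    // have the same size and layout.
    unsafe { mem::transmute::<[MaybeUninit<u8>; 8], [u8; 8]>(buf) }
}
```

With the helper in place, the only `unsafe` left at the call site is the final conversion, which is the part a reviewer actually needs to audit.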
Another thing worth asking: why not have methods directly implemented on arrays? You'd still need some form of uninit_array, but instead of MaybeUninit::array_assume_init(arr), you'd just do arr.assume_init(), and the implementation is on [MaybeUninit<T>; N].
This seems possible specifically because the types exist in core, but open to other interpretations.
I also suggested perhaps just adding Index and IndexMut implementations for MaybeUninit<[T]> that return MaybeUninit<T> and MaybeUninit<[T]> references, then removing the transposed versions of [MaybeUninit<T>] methods.
Of course, the weirdness of MaybeUninit<[T]> is that the length of the slice isn't uninit, just the data in the slice itself, since the pointer metadata is totally separate. But I feel like this is only a weird quirk that quickly becomes natural after a short explanation. Plus, this applies already to other DSTs like dyn Trait, even though a lot of methods require Sized. In principle you could coerce &mut MaybeUninit<T> into &mut MaybeUninit<dyn Trait> if it were more idiomatic and that would make sense as a type, even though you couldn't do much with it.
@clarfonthey with the current implementation, MaybeUninit<[T]> is not possible at all: MaybeUninit requires T: Sized.
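Since `MaybeUninit<[T]>` cannot be written, the slice form that works today is `[MaybeUninit<T>]`, where the length lives in the pointer metadata and only the elements are possibly uninitialized. A small sketch on stable Rust; the names `fill` and `demo` are made up for illustration.

```rust
use std::mem::MaybeUninit;
use std::slice;

/// Writes `value` into every slot and returns the initialized view.
fn fill<'a>(buf: &'a mut [MaybeUninit<u8>], value: u8) -> &'a mut [u8] {
    for slot in buf.iter_mut() {
        slot.write(value);
    }
    // SAFETY: all `buf.len()` slots were written above, and
    // `MaybeUninit<u8>` has the same layout as `u8`.
    unsafe { slice::from_raw_parts_mut(buf.as_mut_ptr().cast::<u8>(), buf.len()) }
}

/// Hypothetical caller using stack storage.
fn demo() -> Vec<u8> {
    let mut storage = [MaybeUninit::<u8>::uninit(); 4];
    fill(&mut storage, 7).to_vec()
}
```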
Docs patch extracted from #102023:
Subject: [PATCH] Add MaybeUninit array transpose impls
Signed-off-by: Alex Saveau <saveau.alexandre@gmail.com>
---
Index: library/core/src/mem/maybe_uninit.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/library/core/src/mem/maybe_uninit.rs b/library/core/src/mem/maybe_uninit.rs
--- a/library/core/src/mem/maybe_uninit.rs (revision 8147e6e427a1b3c4aedcd9fd85bd457888f80972)
+++ b/library/core/src/mem/maybe_uninit.rs (date 1665873934720)
@@ -117,15 +117,12 @@
/// `MaybeUninit<T>` can be used to initialize a large array element-by-element:
///
/// ```
-/// use std::mem::{self, MaybeUninit};
+/// use std::mem::MaybeUninit;
///
/// let data = {
-/// // Create an uninitialized array of `MaybeUninit`. The `assume_init` is
-/// // safe because the type we are claiming to have initialized here is a
-/// // bunch of `MaybeUninit`s, which do not require initialization.
-/// let mut data: [MaybeUninit<Vec<u32>>; 1000] = unsafe {
-/// MaybeUninit::uninit().assume_init()
-/// };
+/// // Create an uninitialized array of `MaybeUninit`.
+/// let mut data: [MaybeUninit<Vec<u32>>; 1000] = MaybeUninit::uninit().transpose();
///
/// // Dropping a `MaybeUninit` does nothing, so if there is a panic during this loop,
/// // we have a memory leak, but there is no memory safety issue.
@@ -133,25 +130,23 @@
/// elem.write(vec![42]);
/// }
///
-/// // Everything is initialized. Transmute the array to the
-/// // initialized type.
-/// unsafe { mem::transmute::<_, [Vec<u32>; 1000]>(data) }
+/// // Everything is initialized. Convert the array to the initialized type.
+/// unsafe { MaybeUninit::<[Vec<_>; 1000]>::assume_init(data.transpose()) }
/// };
///
-/// assert_eq!(&data[0], &[42]);
+/// assert_eq!(&data[42], &[42]);
/// ```
///
/// You can also work with partially initialized arrays, which could
/// be found in low-level datastructures.
///
/// ```
/// use std::mem::MaybeUninit;
/// use std::ptr;
///
-/// // Create an uninitialized array of `MaybeUninit`. The `assume_init` is
-/// // safe because the type we are claiming to have initialized here is a
-/// // bunch of `MaybeUninit`s, which do not require initialization.
-/// let mut data: [MaybeUninit<String>; 1000] = unsafe { MaybeUninit::uninit().assume_init() };
+/// // Create an uninitialized array of `MaybeUninit`.
+/// let mut data: [MaybeUninit<String>; 1000] = MaybeUninit::uninit().transpose();
/// // Count the number of elements we have assigned.
/// let mut data_len: usize = 0;
///
Should MaybeUninit::uninit_array::<LEN>() be stabilised if it can be replaced by [const { MaybeUninit::uninit() }; LEN]?
I’d like to argue for: yes. Stabilizing now can allow users to remove some unsafe blocks soon, whereas we don’t know how long it’s gonna be until const {…} blocks are stable and powerful enough for this use case. If and when they eventually are, we’d end up with two ways of doing the same thing but that’s not really harmful. At that point we can soft-deprecate the constructor by pointing out the new pattern in its docs. (Emitting an actual deprecation warning would not be worth the churn IMO but will be a possibility.)
What other APIs should be added for arrays?
New proposals can be made at any time. The methods already in Nightly don’t need to be blocked on exhausting the design space for other thematically-related methods.
What's the rationale behind a separate feature flag for const? Is there any reason that, if we stabilise this function, we want it to be non-const?
I think we should merge the feature flags and stabilise it in one go.
I don't like name of this method. transpose should be reserved for actual matrix transpose
> I don't like name of this method. transpose should be reserved for actual matrix transpose
That ship has already sailed with Option::transpose and Result::transpose. In this case, I feel it's somewhat unlikely that people doing high-level matrix operations would need to use the low-level MaybeUninit at the same time.
Here's the original discussion for transpose naming options: https://github.com/rust-lang/rust/issues/47338#issuecomment-450529909
Inline const expressions are said to render this method less necessary, but such expressions aren't even necessary for Copy types on stable Rust today:
use std::mem::{MaybeUninit, transmute};

fn main() {
    let mut arr = [MaybeUninit::uninit(); 4];
    for i in 0..arr.len() {
        unsafe { *arr[i].as_mut_ptr() = i as u8; }
    }
    // Prints "[0, 1, 2, 3]"
    println!("{:?}", unsafe { transmute::<_, [u8; 4]>(arr) });
}
What's then the point of this method for these cases? Does it have any codegen advantage? I checked on godbolt.org and the generated assembly for this code didn't initialize the array elements. (Edit: even after tweaking this code to initialize the array positions to std::env::args().count() so that the compiler could not get too smart, I couldn't get it to emit initialization code for the array positions.)
Edit: given that for non-Copy types you would need to resort to nightly features anyway, I'd prefer to just use inline const expressions instead, but there is some disagreement on this point.
Edit 2: actually, it looks like even for non-Copy types you don't need to resort to nightly features when using the inline-const crate.
Because it labels the action you're intending to do, without the reader having to puzzle out "oh, this time the transmute was to initialize the data within the type system".
Exactly like f32::to_bits and Option::unwrap and dozens of other small helper functions. Putting the name on the action is the value, not because there's better codegen than you could write yourself.
Thanks - I should clarify that I was referring to uninit_array in my comment, but I indeed see some readability advantages for adding array_assume_init.
Ah, with uninit_array there is a small but temporary benefit other than the "name the action" argument. SimonSapin mentioned it a little bit up in the thread: Currently the array size of an output array can be inferred but the length of an array expression cannot.
Another thing which is arguably inconsequential is that the intent is a bit more clear: even though I know that the compiler isn't really copying the same uninitialised data N times into the array, it feels better to explicitly say "I want an uninitialised array" rather than "I want to initialise an array with each element uninitialised."
This is actually a reason why I argue that MaybeUninit<[T; N]> being easier to use feels more natural, although due to concerns with how MaybeUninit<[T]> would work (it doesn't at all right now), this isn't really being pursued.
I believe [MaybeUninit::uninit(); 4] didn't work yet when adding uninit_array, it wasn't about "more explicit".
It always worked for Copy types. const blocks are needed for non-Copy types.
Does array_assume_init add much value over using map? Godbolt says they are optimized identically.
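The `map`-based alternative to `array_assume_init` can be sketched as follows. The function names are hypothetical, the inline `const` block needs Rust 1.79 or later, and no claim is made here about the generated code.

```rust
use std::mem::MaybeUninit;

/// Converts an array of initialized `MaybeUninit<T>` into `[T; N]`.
///
/// # Safety
/// Every element of `arr` must be initialized.
unsafe fn array_assume_init_via_map<T, const N: usize>(arr: [MaybeUninit<T>; N]) -> [T; N] {
    arr.map(|slot| unsafe { slot.assume_init() })
}

/// Hypothetical caller with a non-`Copy` element type.
fn strings() -> [String; 3] {
    let mut arr = [const { MaybeUninit::<String>::uninit() }; 3];
    for (i, slot) in arr.iter_mut().enumerate() {
        slot.write(format!("item {i}"));
    }
    // SAFETY: all three slots were written in the loop above.
    unsafe { array_assume_init_via_map(arr) }
}
```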
As for uninit_array, making MaybeUninit unconditionally Copy (or Default) would address that, though that would have its own issues (#62835).
Ah, that's quite clean! Can't believe I didn't think of that. I haven't checked, but assuming this works in all scenarios, I'd be in favor of killing the methods in this tracking issue and the transpose methods I added in favor of const initialization (which needs to be stabilized first) and mapping initialization.
Just realized, uninit_array can also be written with array::from_fn.
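A sketch of the `array::from_fn` variant, which is entirely safe code and works for non-`Copy` element types on stable Rust. The function names are made up for illustration.

```rust
use std::array;
use std::mem::MaybeUninit;

/// Safe stand-in for `uninit_array`, built on `array::from_fn`.
fn uninit_array_from_fn<T, const N: usize>() -> [MaybeUninit<T>; N] {
    array::from_fn(|_| MaybeUninit::uninit())
}

/// Hypothetical caller: element `i` becomes a vector of length `i`.
fn vectors() -> [Vec<u32>; 3] {
    let mut arr: [MaybeUninit<Vec<u32>>; 3] = uninit_array_from_fn();
    for (i, slot) in arr.iter_mut().enumerate() {
        slot.write(vec![0; i]);
    }
    // SAFETY: every slot was written in the loop above.
    arr.map(|slot| unsafe { slot.assume_init() })
}
```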
Replacing u64 with a struct (a more realistic thing to use with MaybeUninit) makes that false: https://rust.godbolt.org/z/o6ecsxjGa
> Replacing u64 with a struct (a more realistic thing to use with MaybeUninit) makes that false: rust.godbolt.org/z/o6ecsxjGa
Dang, that's disappointing.
> Just realized, uninit_array can also be written with array::from_fn.
I'm glad you posted this, it's much nicer than the [(); N].map(|_| MaybeUninit::uninit()) that I was just writing :joy:
I agree that it is possible to obtain an uninitialized array other ways, but uninit_array is the most obvious way to do it, and it's the cleanest solution I've found too. Aside from that, what's blocking stabilization?
I am interested in <&MaybeUninit<[T; N]>>::transpose() and <&mut MaybeUninit<[T; N]>>::transpose(). Specifically, this is helpful when allocating a Box<[T; N]> where [T; N] is too large to fit on the stack; it would make it possible to fill the allocation using iterators.
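Until a by-reference transpose exists, the same effect can be had with a pointer cast. The sketch below assumes Rust 1.82 or later, where `Box::new_uninit` and `Box<MaybeUninit<T>>::assume_init` are stable; `boxed_indices` is a hypothetical example function.

```rust
use std::mem::MaybeUninit;
use std::slice;

/// Builds `[0, 1, 2, ...]` directly on the heap, never placing the
/// array on the stack.
fn boxed_indices<const N: usize>() -> Box<[u32; N]> {
    let mut boxed: Box<MaybeUninit<[u32; N]>> = Box::new_uninit();

    // Manual "transpose by reference": view the allocation as N
    // individually uninitialized elements.
    // SAFETY: `MaybeUninit<[u32; N]>` and `[MaybeUninit<u32>; N]` have the
    // same layout, and nothing else touches `boxed` while `slots` is in use.
    let slots: &mut [MaybeUninit<u32>] = unsafe {
        slice::from_raw_parts_mut(boxed.as_mut_ptr().cast::<MaybeUninit<u32>>(), N)
    };

    // The slice view can now be driven by ordinary iterators.
    for (i, slot) in slots.iter_mut().enumerate() {
        slot.write(i as u32);
    }

    // SAFETY: all N elements were written above.
    unsafe { boxed.assume_init() }
}
```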
Would it be considered related to this issue?
@rust-lang/libs-api: @rfcbot fcp close
(Only in regard to MaybeUninit::uninit_array, not the other unstable APIs still tracked by this issue.)
I propose that we accept https://github.com/rust-lang/rust/pull/125082 to remove MaybeUninit::uninit_array() in favor of having callers use [MaybeUninit::uninit(); N] and [const { MaybeUninit::uninit() }; N]. (The const block is required for T: !Copy.)
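The two replacement forms side by side, as a sketch with hypothetical function names. The second form relies on inline `const` blocks, stable since Rust 1.79.

```rust
use std::mem::MaybeUninit;

/// `Copy` element type: a plain repeat expression is enough.
fn copy_elements() -> [u8; 4] {
    let mut buf = [MaybeUninit::<u8>::uninit(); 4];
    for (i, slot) in buf.iter_mut().enumerate() {
        slot.write(i as u8 * 2);
    }
    // SAFETY: every slot was written in the loop above.
    buf.map(|slot| unsafe { slot.assume_init() })
}

/// Non-`Copy` element type: `MaybeUninit<String>` is not `Copy`, so the
/// repeated value has to be a constant expression.
fn non_copy_elements() -> [String; 2] {
    let mut buf = [const { MaybeUninit::<String>::uninit() }; 2];
    buf[0].write(String::from("a"));
    buf[1].write(String::from("b"));
    // SAFETY: both slots were written above.
    buf.map(|slot| unsafe { slot.assume_init() })
}
```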
When uninit_array was originally introduced, it was useful because it was a safe wrapper around an unsafe implementation: unsafe { MaybeUninit::<[MaybeUninit<T>; N]>::uninit().assume_init() }. The best equivalent safe way to express what this does used to be significantly more verbose:
fn workaround<T, const N: usize>() -> [MaybeUninit<T>; N] {
    trait Workaround: Sized {
        const UNINIT: MaybeUninit<Self>;
    }
    impl<T> Workaround for T {
        const UNINIT: MaybeUninit<Self> = MaybeUninit::uninit();
    }
    [<T as Workaround>::UNINIT; N]
}
These days, with const {…} expressions stabilizing in Rust 1.79 (https://github.com/rust-lang/rust/pull/104087), the justification for a dedicated uninit_array is weaker and limited to convenience and discoverability.
My opinion aligns with @kpreid's characterization in the PR description:
> The only remaining question is whether it is an important enough convenience to keep it around.
> I believe it is net good to remove this function, on the principle that it is better to compose two orthogonal features (MaybeUninit and array construction) than to have a specific function for the specific combination, now that that is possible.
The counter perspective is the one in https://github.com/rust-lang/rust/pull/125082#issuecomment-2108313867.
> I still prefer MaybeUninit::uninit_array() to [const { MaybeUninit::uninit() }; N], it's shorter and more readable IMHO.
I do not dispute that it is 25% fewer characters.
For the common case of copyable contents like an uninit u8 buffer, [MaybeUninit::uninit(); N] is even shorter than that.
Regarding readability, the comparison assumes one has read and retained this part of https://doc.rust-lang.org/std/mem/union.MaybeUninit.html, which is not free. Superseding a part of the extensive, ad-hoc API of MaybeUninit with better composable semantics is probably good for readability.
For discoverability, I'd expect [MaybeUninit::uninit(); N] is more discoverable than MaybeUninit::uninit_array(). From there, a compiler diagnostic can hint to add const if dealing with a non-Copy element type.
@rfcbot fcp close
Team member @dtolnay has proposed to close this. The next step is review by the rest of the tagged team members:
No concerns currently listed.
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!
See this document for info about what commands tagged team members can give me.
I realized I'll need to file a compiler diagnostics bug. Currently on nightly:
use std::mem::MaybeUninit;
fn main() {
let _: [MaybeUninit<String>; 2] = [MaybeUninit::uninit(); 2];
}
error[E0277]: the trait bound `String: Copy` is not satisfied
--> src/main.rs:4:40
|
4 | let _: [MaybeUninit<String>; 2] = [MaybeUninit::uninit(); 2];
| ^^^^^^^^^^^^^^^^^^^^^ the trait `Copy` is not implemented for `String`, which is required by `MaybeUninit<String>: Copy`
|
= note: required for `MaybeUninit<String>` to implement `Copy`
= note: the `Copy` trait is required because this value will be copied for each element of the array
= help: create an inline `const` block, see RFC #2920 <https://github.com/rust-lang/rfcs/pull/2920> for more information
help: consider creating a new `const` item and initializing it with the result of the function call to be used in the repeat position
|
4 ~ const ARRAY_REPEAT_VALUE: MaybeUninit<String> = MaybeUninit::uninit();
5 ~ let _: [MaybeUninit<String>; 2] = [ARRAY_REPEAT_VALUE; 2];
|
We should change this to suggest [const { MaybeUninit::uninit() }; 2], instead of the current suggestion which is:
const ARRAY_REPEAT_VALUE: MaybeUninit<String> = MaybeUninit::uninit();
let _: [MaybeUninit<String>; 2] = [ARRAY_REPEAT_VALUE; 2];
:bell: This is now entering its final comment period, as per the review above. :bell:
Is the intention that a new tracking issue be opened for the remaining methods?
A FCP to close doesn't mean that the issue has to be actually closed after it completes. dtolnay wrote
(Only in regard to MaybeUninit::uninit_array, not the other unstable APIs still tracked by this issue.)
The final comment period, with a disposition to close, as per the review above, is now complete.
As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.
This should be closed since inline const blocks have already landed in stable, right? Or is that only part of this issue? Just going through MaybeUninit and seeing that the new uninit array method is still on nightly.
This is a meta-tracking issue for multiple APIs that are linked across multiple issues. Right now it only includes two methods, but since there seems to be a desire to add more, this issue can be used as a placeholder for those discussions until those methods are added.
Public API
Steps / History
Relevant Links
Unresolved Questions
Should MaybeUninit::uninit_array::<LEN>() be stabilised if it can be replaced by [const { MaybeUninit::uninit() }; LEN]?
Is array_assume_init the right pattern, or should we convert from [MaybeUninit<T>; N] back to MaybeUninit<[T; N]> first?