Closed glandium closed 5 years ago
Yeah, something like this would make sense:
/// Safety: the cell must be in initialized state.
unsafe fn get_unchecked(&self) -> &T
Could you give a specific example where this is useful though? My gut feeling is that, to use this function safely, you need to prove that initialization indeed has happened. A natural proof would be the &T
you get from calling the initialization function. However, if you can thread a &T
to the get_unchecked
callsite, you don't need get_unchecked
! And if you can't thread the &T
, there's a high chance that the cell might not be initialized.
This is the kind of thing I'm thinking about (modulo errors, and the currently non-existing method):
pub struct Foo {
...
}
#[derive(Copy, Clone)]
pub struct FooHandle(());
impl Foo {
fn new() -> Self {
...
}
fn handle(&'static self) -> FooHandle {
// Only way to get a FooHandle is here, which means we got a ref, or via Copy of another FooHandle.
let result = FooHandle(());
assert_eq!(self as *const _, result.get() as *const _);
result
}
}
static FOO: Lazy<Foo> = Lazy::new(|| Foo::new());
impl FooHandle {
fn get(self) -> &'static Foo {
unsafe { Lazy::get_unchecked(&FOO) };
}
}
Thanks for the example! I agree that this is something which can't be expressed using only safe code, without compromising elsewhere.
Specifically, FooHandle
can be defined as FooHandle(&'static Foo)
, and that would make this pattern fully-safe, at the expense of making FooHandle
a non-ZST type. If we want a ZST, we need get unchecked. BTW, another way to define FooHandle
is FooHandle(PhantomData<&'static Foo>)
, which maybe is better with respect to auto-traits.
My plan is to add OnceCell::get_unchecked
but not Lazy::get_unchecked
: Lazy
is a purely convenience type, and I fear that adding unsafe methods to it might lead to misuse. So, the example would look like this:
pub struct Foo { ... }
#[derive(Copy, Clone)]
pub struct FooHadle(PhantomData<&'static Foo>);
static INSTANCE: sync::OnceCell<Foo> = sync::OnceCell::new();
impl Foo {
fn new() -> Self { ... }
fn new_handle() -> FooHandle {
INSTANCE.get_or_init(Foo::new);
FooHandle(PhantomData)
}
}
impl FooHandle {
fn get(self) -> &'static Foo {
unsafe { INSTANCE.get_unchecked() }
}
}
Heh, I oversimplified my actual use-case, which does, in fact, have PhantomData<&'static Foo>
.
I do agree adding such a function to Lazy
might be too footgun-y.
Hm, cannot .get().unwrap_or_else(|| std::hint::unreachable_unchecked())
be used for that purpose? I think LLVM should be able to optimize that well considering it should realize the is_initialized
read is unnecessary, as it being false
will cause UB.
I'm against introducing footguns, because something you may think the value is always initialized when it turns out it isn't. You should think a lot about using get_unchecked
I feel like.
For instance, the following code:
pub unsafe fn example(y: once_cell::sync::OnceCell<i32>) -> i32 {
*y.get()
.unwrap_or_else(|| std::hint::unreachable_unchecked())
}
Seems to compile to this (with std
feature):
mov rax, qword ptr [rdi]
mov eax, dword ptr [rdi + 12]
ret
Which is pretty much optimal (considering the data layout of sync
variant).
The parking-lot variation of sync generates this.
mov al, byte ptr [rdi + 9]
mov eax, dword ptr [rdi + 4]
ret
Meanwhile the unsync
creates the following code:
mov eax, esi
ret
Which is also optimal.
One would think so, but apparently LLVM fails to optimize away a load: https://github.com/matklad/once_cell/pull/60#issuecomment-531489557
released as 1.2.0
It would be useful to have unsafe methods that allow to get to the underlying values in OnceCell and Lazy, that assume initialization happened. A use case would be to avoid the initialization check when you know by other means that initialization did happen.