Please allow to skip checks with unsafe methods

glandium commented 5 years ago

It would be useful to have unsafe methods that allow to get to the underlying values in OnceCell and Lazy, that assume initialization happened. A use case would be to avoid the initialization check when you know by other means that initialization did happen.

matklad commented 5 years ago

Yeah, something like this would make sense:

/// Safety: the cell must be in initialized state. 
unsafe fn get_unchecked(&self) -> &T

Could you give a specific example where this is useful though? My gut feeling is that, to use this function safely, you need to prove that initialization indeed has happened. A natural proof would be the &T you get from calling the initialization function. However, if you can thread a &T to the get_unchecked callsite, you don't need get_unchecked! And if you can't thread the &T, there's a high chance that the cell might not be initialized.

glandium commented 5 years ago

This is the kind of thing I'm thinking about (modulo errors, and the currently non-existing method):

pub struct Foo {
    ...
}
#[derive(Copy, Clone)]
pub struct FooHandle(());
impl Foo {
    fn new() -> Self {
        ...
    }
    fn handle(&'static self) -> FooHandle {
        // Only way to get a FooHandle is here, which means we got a ref, or via Copy of another FooHandle.
        let result = FooHandle(());
        assert_eq!(self as *const _, result.get() as *const _);
        result
    }
}
static FOO: Lazy<Foo> = Lazy::new(|| Foo::new());
impl FooHandle {
    fn get(self) -> &'static Foo {
        unsafe { Lazy::get_unchecked(&FOO) };
    }
}

matklad commented 5 years ago

Thanks for the example! I agree that this is something which can't be expressed using only safe code, without compromising elsewhere.

Specifically, FooHandle can be defined as FooHandle(&'static Foo), and that would make this pattern fully-safe, at the expense of making FooHandle a non-ZST type. If we want a ZST, we need get unchecked. BTW, another way to define FooHandle is FooHandle(PhantomData<&'static Foo>), which maybe is better with respect to auto-traits.

My plan is to add OnceCell::get_unchecked but not Lazy::get_unchecked: Lazy is a purely convenience type, and I fear that adding unsafe methods to it might lead to misuse. So, the example would look like this:

pub struct Foo { ... }

#[derive(Copy, Clone)]
pub struct FooHadle(PhantomData<&'static Foo>);

static INSTANCE: sync::OnceCell<Foo> = sync::OnceCell::new();

impl Foo {
    fn new() -> Self { ... }
    fn new_handle() -> FooHandle {
        INSTANCE.get_or_init(Foo::new);
        FooHandle(PhantomData)
    }
}

impl FooHandle {
    fn get(self) -> &'static Foo {
        unsafe { INSTANCE.get_unchecked() }
    }
}

glandium commented 5 years ago

Heh, I oversimplified my actual use-case, which does, in fact, have PhantomData<&'static Foo>.

I do agree adding such a function to Lazy might be too footgun-y.

KamilaBorowska commented 5 years ago

Hm, cannot .get().unwrap_or_else(|| std::hint::unreachable_unchecked()) be used for that purpose? I think LLVM should be able to optimize that well considering it should realize the is_initialized read is unnecessary, as it being false will cause UB.

I'm against introducing footguns, because something you may think the value is always initialized when it turns out it isn't. You should think a lot about using get_unchecked I feel like.

For instance, the following code:

pub unsafe fn example(y: once_cell::sync::OnceCell<i32>) -> i32 {
    *y.get()
        .unwrap_or_else(|| std::hint::unreachable_unchecked())
}

Seems to compile to this (with std feature):

    mov rax, qword ptr [rdi]
    mov eax, dword ptr [rdi + 12]
    ret

Which is pretty much optimal (considering the data layout of sync variant).

The parking-lot variation of sync generates this.

    mov al, byte ptr [rdi + 9]
    mov eax, dword ptr [rdi + 4]
    ret

Meanwhile the unsync creates the following code:

    mov eax, esi
    ret

Which is also optimal.

matklad commented 5 years ago

One would think so, but apparently LLVM fails to optimize away a load: https://github.com/matklad/once_cell/pull/60#issuecomment-531489557

matklad commented 5 years ago

released as 1.2.0

matklad / once_cell

Please allow to skip checks with unsafe methods #59