rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
97.96k stars 12.69k forks source link

"incorrect use of uninit memory" lint should fire for multi-variant enum #73608

Closed RalfJung closed 4 years ago

RalfJung commented 4 years ago

I tried this code:

#[derive(Copy, Clone, Debug)]
enum Fruit {
    Apple,
    _Banana,
}

fn foo() -> Fruit {
    unsafe {
        let mut r = mem::uninitialized();
        ptr::write(&mut r as *mut Fruit, Fruit::Apple);
        r
    }
}

I expected to see this happen: It should warn about Fruit not working with mem::uninitialized.

Instead, this happened: There is no diagnostic.

The problem here is that not all enums should cause the warning. If the enum only has one variant, being uninitialized could actually be fine (but variants that are uninhabited and ZST do not count). This is easy to check at run-time where we have the full layout (so we panic), but statically we just have the type and that makes it harder.

As a start, we could count the fieldless variants; if there are at least 2 then the enum definitely needs a tag and thus may not remain uninitialized.

Cc @stepancheg

lcnr commented 4 years ago

I was shortly confused why one variant enums can be okay, so here's a short example:

enum Test {
    Variant(u8),
}

Note that we should also not warn on an enum with #[repr(uN)] and 2^N variants. https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=425e29d79a26d57552d35a212126552b

RalfJung commented 4 years ago

I was shortly confused why one variant enums can be okay, so here's a short example:

To be clear, this is not a guarantee. It's just how we currently represent enums. It might become UB in the future to leave such an enum uninit. Right now, it is a case of incorrectly relying on unspecified data layout.

Note that we should also not warn on an enum with #[repr(uN)] and 2^N variants.

That is not correct, we should warn for such cases. Uninitialized memory is different from memory having some arbitrary value. So when we say "the tag has to encode a valid discriminant", then even if every initialized bit pattern encodes a valid discriminant, that does not mean that the uninitialized bit pattern encodes a valid discriminant.

RalfJung commented 4 years ago

To be clear, this is not a guarantee. It's just how we currently represent enums. It might become UB in the future to leave such an enum uninit. Right now, it is a case of incorrectly relying on unspecified data layout.

In light of this, I wonder if we should just warn about all enums, to avoid relying on such unspecified details? However that would mean that we'd warn about cases that are not UB... so we should likely at least make the message different.

lcnr commented 4 years ago

Don't both cases have the same reasoning though? I don't quite see why you do not want to warn on enum Test { Variant(u8) } while warning on 2^N variants, considering that both rely on (very similar) unspecified behavior.

This is easy to check at run-time where we have the full layout (so we panic)

Afaict we check this during code-gen, not at run-time. So we could emit post monomorphization warnings here, which we probably do not want, e.g.

use std::{mem, ptr};

unsafe trait Foo {
    const IS_POD: bool;
}

unsafe impl Foo for bool {
    const IS_POD: bool = false;
}

unsafe impl Foo for u8 {
    const IS_POD: bool = true;
}

fn bar<T: Foo + Default>() -> T {
    unsafe {
        if T::IS_POD {
            let mut x = mem::uninitialized();
            ptr::write(&mut x, T::default());
            x
        } else {
            T::default()
        }
    }
}

fn main() {
    let _: bool = bar();
    let _: u8 = bar();
}
lcnr commented 4 years ago

To be clear, this is not a guarantee. It's just how we currently represent enums. It might become UB in the future to leave such an enum uninit. Right now, it is a case of incorrectly relying on unspecified data layout.

In light of this, I wonder if we should just warn about all enums, to avoid relying on such unspecified details? However that would mean that we'd warn about cases that are not UB... so we should likely at least make the message different.

Isn't enum Foo { Variant(u8) } also UB except that it currently can't be triggered?

There is obviously a somewhat important difference between theoretical and "actual" UB here, which is probably also the line we draw when deciding if the call itself should panic. IMO we should warn for anything which is "theoretically" UB while panicking in cases which are currently known to sometimes to cause problems.

RalfJung commented 4 years ago

Afaict we check this during code-gen, not at run-time. So we could emit post monomorphization warnings here, which we probably do not want, e.g.

We do, sorry for the vagueness. But indeed post-monomorphization time errors would likely not even show up in the right crate here, and they would break proper run-time tests as you noted.

Don't both cases have the same reasoning though? I don't quite see why you do not want to warn on enum Test { Variant(u8) } while warning on 2^N variants, considering that both rely on (very similar) unspecified behavior.

No, it's very different. Your single-variant enum is (under current rustc layout) equivalent to u8 itself. It is right that uninitialized u8 is currently deemed UB by the reference, but whether or not this is really what we want is discussed in https://github.com/rust-lang/unsafe-code-guidelines/issues/71.

For enums with a tag, I have not seen anyone suggest to permit them being uninitialized under any circumstances. It would we an awful special case in our semantics, and I don't see any reason to do this.

IMO we should warn for anything which is "theoretically" UB

That is not currently our approach e.g. for uninitialized u8.