rust-lang / unsafe-code-guidelines

Forum for discussion about what unsafe code can and can't do
https://rust-lang.github.io/unsafe-code-guidelines
Apache License 2.0

Layout and behavior of enums with uninhabited fields in some variants #443

Open RalfJung opened 1 year ago

RalfJung commented 1 year ago

This is about enums like the following:

enum E1 { A, B(!) }
enum E2 { A, B(!), C(i32, !) }

The first interesting point is that E1 does not get any space assigned for storing a discriminant: it has size 0. The algorithm that assigns discriminants skips B entirely. In Miri we had to make SetDiscriminant throw UB early if the requested variant is uninhabited, since otherwise we get ICEs later.
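A minimal sketch of how to observe this on stable Rust. It assumes std::convert::Infallible as a stand-in for `!` (which needs the unstable never_type feature in this position); the helper name e1_size is made up for illustration. The printed size reflects current compiler behavior, not a documented guarantee.

```rust
use std::convert::Infallible;
use std::mem::size_of;

// Stable stand-in for `enum E1 { A, B(!) }`.
#[allow(dead_code)]
enum E1 {
    A,
    B(Infallible),
}

// Hypothetical helper: reports the size the compiler picked for E1.
fn e1_size() -> usize {
    size_of::<E1>()
}

fn main() {
    // Variant B is skipped when discriminants are assigned, so no tag is stored.
    println!("size_of::<E1>() = {}", e1_size());
    // Option<Infallible> has the same shape: one unit variant, one uninhabited one.
    println!("size_of::<Option<Infallible>>() = {}", size_of::<Option<Infallible>>());
}
```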

However, E2 has size 8, even though there is only a single valid variant and that variant has size 0. I think we currently always provide storage for all fields of all variants, even if they are uninhabited. This is useful because in theory it lets us compile E2::C(f(), panic!()) into something that does in-place initialization:

let val: E2;
val.C.0 = f();
panic!();
SetDiscriminant(val, C); // this would be UB but we don't get here

(AFAIK we currently don't actually do that, we introduce temporaries instead.)
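The reason storage for C.0 could matter is that operands are evaluated left to right, so f() really does run before the panic. A small sketch of that ordering, again assuming Infallible in place of `!`; the names F_RAN, build and f_runs_before_panic are made up for illustration, and it relies on the default panic=unwind setting.

```rust
use std::convert::Infallible;
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

// Stable stand-in for `enum E2 { A, B(!), C(i32, !) }`.
#[allow(dead_code)]
enum E2 {
    A,
    B(Infallible),
    C(i32, Infallible),
}

// Records whether f() was executed.
static F_RAN: AtomicBool = AtomicBool::new(false);

fn f() -> i32 {
    F_RAN.store(true, Ordering::SeqCst);
    42
}

// The first operand is evaluated, then the second one diverges,
// so the value E2::C is never actually constructed.
#[allow(unreachable_code)]
fn build() -> E2 {
    E2::C(f(), panic!("second operand diverges"))
}

// Hypothetical helper: true if f() ran and build() then panicked.
fn f_runs_before_panic() -> bool {
    let result = panic::catch_unwind(|| {
        build();
    });
    result.is_err() && F_RAN.load(Ordering::SeqCst)
}

fn main() {
    println!("f ran before the panic: {}", f_runs_before_panic());
}
```

Whether the result of f() lands in a temporary or directly in the enum's storage is not observable here; the sketch only shows that the value has to live somewhere before the panic.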

For structs we have decided that we will always have storage for all fields, and this is unavoidable since safe code can partially initialize a struct. However, safe code cannot partially initialize an enum, so we could soundly decide that E2 has size 0. We have to decide between smaller enums and in-place initialization for arbitrary enums. (We can of course have small enums and then have an analysis that uses in-place initialization where possible -- but in generic MIR, it might not be possible to tell whether the variant we are about to initialize is inhabited.)
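To make the struct side of this comparison concrete, here is a sketch under a few assumptions: Infallible stands in for `!`, the types Pair and WithNever and the function reinit_left are made up for illustration, and "partial initialization" is shown in the form safe code allows today, namely moving one field out of a struct and writing it back in place.

```rust
use std::convert::Infallible;
use std::mem::size_of;

struct Pair {
    left: String,
    right: String,
}

// Safe code can leave a struct partially initialized and then refill it.
fn reinit_left(mut p: Pair) -> Pair {
    drop(p.left); // p.left is moved out; p.right is still initialized
    p.left = String::from("new"); // re-initialize the field in place
    p
}

// A struct with an uninhabited field still has storage for its other field.
#[allow(dead_code)]
struct WithNever(i32, Infallible);

// Stable stand-in for `enum E2 { A, B(!), C(i32, !) }`.
#[allow(dead_code)]
enum E2 {
    A,
    B(Infallible),
    C(i32, Infallible),
}

fn main() {
    let p = reinit_left(Pair {
        left: String::from("old"),
        right: String::from("kept"),
    });
    println!("{} / {}", p.left, p.right);
    println!("size_of::<WithNever>() = {}", size_of::<WithNever>());
    // Current behavior only; this is the size the thread is debating.
    println!("size_of::<E2>() = {}", size_of::<E2>());
}
```

There is no analogous safe operation that fills in one field of an enum variant before the variant is selected, which is why a size-0 E2 would be sound.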

Nadrieril commented 3 weeks ago

Is it guaranteed (and documented somewhere) that ReadDiscriminant cannot return the discriminant of an uninhabited variant (on pain of UB)? Is that true even of repr(whatever) enums?

RalfJung commented 3 weeks ago

That is the current behavior as implemented in Miri, but I don't think we guarantee it. ReadDiscriminant is more of an implementation detail anyway; it doesn't show up directly in the surface syntax.

Nadrieril commented 3 weeks ago

Do we not specify what data a pattern-match accesses?

My real question is: is there, or will there ever be, a UB-free execution that can reach a match arm that matches on an uninhabited variant? Such an execution could look like the following (I'm assuming this example is fine for e.g. T=u8, do tell me if I'm wrong):

#![feature(never_type)]

#[repr(u8)]
enum Enum<T> {
    A = 0,
    B = 1,
    C(u8, T) = 2,
}

fn main() {
    let mut x: Enum<!> = Enum::A;
    unsafe { (&raw mut x).cast::<u8>().write(2u8) };
    match x {
        Enum::A => println!("got A!"),
        Enum::B => println!("got B!"),
        Enum::C(..) => println!("got C!"), // is this ever reachable?
    }
}

RalfJung commented 3 weeks ago

> Do we not specify what data a pattern-match accesses?

We specify basically nothing here. It's all details of how exactly match gets lowered to MIR / MiniRust, and that isn't really specified in any way.

Personally I think it's reasonable to say that GetDiscriminant will never return the discriminant of an uninhabited variant. Though this is yet another case of special-casing uninhabited types in the semantics (similar to https://github.com/rust-lang/unsafe-code-guidelines/issues/413 in that sense), which not everyone is happy with. And for repr(C) enums it is particularly tricky since we make so many layout guarantees about them.
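As an illustration of how strong those layout guarantees are, here is a sketch for the inhabited case T = u8 of the repr(u8) enum from earlier in the thread. It assumes the documented layout of primitive-repr enums with fields (a u8 tag followed by the variant's fields in repr(C) order), and unlike the earlier example it also writes both fields so the resulting value is fully initialized. The function name overwrite_with_c is made up. It deliberately does not try the T = ! case, which is the open question here.

```rust
use std::ptr::addr_of_mut;

#[allow(dead_code)]
#[repr(u8)]
#[derive(Debug, PartialEq)]
enum Enum<T> {
    A = 0,
    B = 1,
    C(u8, T) = 2,
}

// Builds Enum::C(7, 9) byte by byte, starting from Enum::A.
fn overwrite_with_c() -> Enum<u8> {
    let mut x: Enum<u8> = Enum::A;
    let p = addr_of_mut!(x).cast::<u8>();
    unsafe {
        p.write(2); // tag byte: discriminant of C
        p.add(1).write(7); // first field of C
        p.add(2).write(9); // second field of C (T = u8)
    }
    x
}

fn main() {
    // Tag plus two u8 fields, all with alignment 1.
    assert_eq!(std::mem::size_of::<Enum<u8>>(), 3);
    println!("{:?}", overwrite_with_c()); // prints "C(7, 9)"
}
```

With T uninhabited, the same tag write would select a variant that has no valid values, and that is exactly where a rule such as "the discriminant read never yields an uninhabited variant" would have to interact with these layout promises.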