rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
97.82k stars 12.66k forks source link

Compilation error when matching reference to empty enum #131452

Open dhedey opened 1 week ago

dhedey commented 1 week ago

[!NOTE] Useful background links which have been shared in comments:

Summary

I tried this code:

enum A {}

fn f(a: &A) -> usize {
    match a {}
}

I expected to see this happen: This compiles successfully.

Instead, this happened: (E0004 compiler error, expandable below)

E0004 Compiler Error

```text error[E0004]: non-exhaustive patterns: type `&A` is non-empty --> src/lib.rs:4:11 | 4 | match a {} | ^ | note: `A` defined here --> src/lib.rs:1:6 | 1 | enum A {} | ^ = note: the matched value is of type `&A` = note: references are always considered inhabited help: ensure that all possible cases are being handled by adding a match arm with a wildcard pattern as shown | 4 ~ match a { 5 + _ => todo!(), 6 + } | For more information about this error, try `rustc --explain E0004`. error: could not compile `playground` (lib) due to 1 previous error ```

Searching for "Rust E0004" links to docs that don't explain why references to uninhabited types need to be matched: https://doc.rust-lang.org/error_codes/E0004.html - you have to search for "references are always considered inhabited" which takes you to this issue from 2020 - more on this history in the Background section below.

Motivating example

This comes up commonly when creating macros which generate enums, e.g. a very simplified example (playground link):

macro_rules! define_enum {
    (
        $name: ident,
        [
            $(
                $variant: ident => $string_label: expr
            ),* $(,)?
        ]$(,)?
    ) => {
        enum $name {
            $($variant,)*
        }

        impl $name {
            fn label(&self) -> &'static str {
                match self {
                    $(Self::$variant => $string_label,)*
                }
            }
        }
    }
}

// This compiles
define_enum!{
    MyEnum,
    [Foo => "foo", Bar => "bar"],
}

// This fails compilation with E0004
define_enum!{
    MyEmptyEnum,
    [],
}
Compiler Error

```text error[E0004]: non-exhaustive patterns: type `&MyEmptyEnum` is non-empty --> src/lib.rs:16:23 | 16 | match self { | ^^^^ ... 31 | / define_enum!{ 32 | | MyEmptyEnum, 33 | | [], 34 | | } | |_- in this macro invocation | note: `MyEmptyEnum` defined here --> src/lib.rs:32:5 | 32 | MyEmptyEnum, | ^^^^^^^^^^^ = note: the matched value is of type `&MyEmptyEnum` = note: references are always considered inhabited = note: this error originates in the macro `define_enum` (in Nightly builds, run with -Z macro-backtrace for more info) help: ensure that all possible cases are being handled by adding a match arm with a wildcard pattern as shown | 16 ~ match self { 17 + _ => todo!(), 18 + } | For more information about this error, try `rustc --explain E0004`. error: could not compile `playground` (lib) due to 1 previous error ```

The fact that this doesn't work for empty enums is quite a gotcha, and I've seen this issue arise a few times as an edge case. In most cases, it wasn't caught until a few months after, when someone uses the macro to create an enum with no variants.

But why would we ever use such a macro to generate empty enums? Well this can fall out naturally when generating a hierarchy of enums, where some inner enums are empty, e.g.

define_namespaced_enums! {
    Foo => {
        Fruit: [apple, pear],
        Vegetables: [],
    }
    Bar => {
        Fruit: [banana],
        Vegetables: [carrot],
    }
}

Workarounds

Various workarounds include:

Background

This was previously raised in this issue: https://github.com/rust-lang/rust/issues/78123 but was closed as "expected behaviour" - due to the fact that:

... the reason that &Void (where Void is an uninhabited type) is considered inhabited is because unsafe code can construct an &-reference to an uninhabited type (e.g. with ptr::null() or mem::transmute()). I don't know enough to be sure though.

However, when I raised this as a motivating example in https://github.com/rust-lang/unsafe-code-guidelines/issues/413#issuecomment-2399672350, @RalfJung suggested I raise a new rustc issue for this, and that actually the match behaviour is independent of the UB semantics decision:

So please open a new issue (in the rustc repo, not here) about the macro concern. That's a valid point, just off-topic here. Whether we accept that match code is essentially completely independent of what we decide here.

Opinions

This was written before reading this blog post where the suggested "auto-never" rules capture these intuitions/opinions much better than I ever could.

After learning about the ! pattern, I find it quite surprising that we require an explicit ! pattern and that match self { } doesn’t in itself capture the same intention as match self { ! }.

Ideally if we change the behaviour of this, I’d also expect the following to compile:

fn optional_match(x: Option<&Self>) {
    match x {
        None => {}
    }
}

And not require a Some(!) pattern.

Meta

rustc --version --verbose:

rustc 1.81.0 (eeb90cda1 2024-09-04)
binary: rustc
commit-hash: eeb90cda1969383f56a2637cbd3037bdf598841c
commit-date: 2024-09-04
host: aarch64-apple-darwin
release: 1.81.0
LLVM version: 18.1.7

Also works on nightly.

RalfJung commented 1 week ago

For more background (but partially outdated), also see this blog post by @nikomatsakis.

Cc @rust-lang/opsem @rust-lang/types @Nadrieril

My comment referred to the fact that the safety invariant is uncontroversial in your example, and sufficient to ensure that without unsafe code, nothing weird can happen -- in the specific case considered here, where the match is on a reference.

There are related, more subtle cases though. For instance, what if ptr: *const !/*const Void, and now we do match *ptr {}? Such a pointer can be constructed in safe code, and let _ = *ptr; is not UB and might even be(come) safe. Similarly, match *ptr { _ => () } is not UB. But with an empty match arm now suddenly this is elevated to UB, even though no safety invariant was violated going into the match. IMO that's a footgun we want to avoid: the absence of code introduces UB into a program that doesn't ever obviously violate any validity invariant. (Constructing a place of type ! is fine. It's reading the discriminant that's a problem, but whether and when a match reads the discriminant depends on many factors, and in this case it is strictly the absence of a match arm that is the only thing this code did wrong.)

Also for references, we may allow code to temporarily break the &Enum safety invariant (by not making it a validity invariant for the discriminant to be valid). Not sure if we should be concerned about that, but it is something to consider.

So my position on this heavily depends on whether the place we are matching on is "safe" or not. A place is "unsafe" if it involves * on a raw pointer or a union field. For unsafe places, I think we need to be careful. I think we want to introduce ! patterns, not accept anything we don't already accept, and lint against existing code like this to enforce the use of never patterns any time "empty type" reasoning is done on an unsafe place.

For safe places, I am much less concerned. Maybe we want an opt-in lint that recommends never patterns to make this explicit. Maybe we want the lint to be warn-by-default when there is unsafe code nearby? But that seems hard and we don't have any other lint like this.

There's also the question of irrefutable patterns exploiting empty types, such as let OK(foo) = place. There we can't suggest never patterns as that would destroy the entire point of these patterns. So IMO we should not do empty type reasoning here for unsafe places, i.e. this and this should keep being rejected. Though arguably by writing Ok(_) the programmer did explicitly request a discriminant to be read, and Result<(), !> simply doesn't have a discriminant for its Err variant, so this is less bad in my view than the match *ptr {} case.

ijackson commented 1 week ago

Can't we make a distinction between unsafe and safe places and auto-scrutinise the discriminant of safe places?

IOW declare that &Void is unsound and say that *Void is allowed but you can't do the empty match on it.

RalfJung commented 1 week ago

Yes, that is basically what I am suggesting. It is a new distinction between two kinds of places that I don't think we already have, so it's not an entirely trivial extension of the status quo.

compiler-errors commented 1 week ago

It is a new distinction between two kinds of places that I don't think we already have, so it's not an entirely trivial extension of the status quo.

Well, since min_exhaustive_patterns was stabilized, we considers some places to be "exhaustive" and some not to be, depending on their expression and their type... so we kinda already do this, at least in the THIR.

Nadrieril commented 1 week ago

+1 to most of what Ralf just said. I'm in favor of making the & case convenient at the cost of footguns for users of unsafe references (that's full of footguns already anyway).

Result<(), !> simply doesn't have a discriminant for its Err variant

As far as I remember we hadn't yet committed to that, at least in more complex cases like Result<(), (!, T)>.

let OK(foo) = place. There we can't suggest never patterns

We can suggest let (Ok(foo) | Err(!)) = place.