rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
97.94k stars 12.68k forks source link

Tracking issue for RFC 1872: `exhaustive_patterns` feature #51085

Open kennytm opened 6 years ago

kennytm commented 6 years ago

This tracks the exhaustive_patterns feature which allows uninhabited variant to be omitted (bug report: #12609; relevant RFC: rust-lang/rfcs#1872).

fn safe_unwrap<T>(x: Result<T, !>) -> T {
    match x {
        Ok(y) => y,
    }
}
varkor commented 6 years ago

The lack of this feature has unintuitive consequences (e.g. https://github.com/rust-lang/rust/issues/55123). I haven't seen any problems arising from it either.

cc @rust-lang/lang: could this be put forward for stabilisation?

nikomatsakis commented 6 years ago

@varkor I am mildly reluctant until we have agreed upon the story around "never patterns" -- but I suppose given a sufficiently conservative definition of "uninhabited pattern" I'd be all right with it. Do you think you could write-up a "stabilization report" that includes examples of the tests we have and also documents the limits of what sorts of things would be considered uninhabited in stable code?

(That said, I think "never patterns" gives us a path forward that addresses my concerns around aggressive uninhabited definitions, by just turning the "ambiguous" cases into lints in the presence of unsafe code. I'd be happier though if we had an RFC on the topic.)

nikic commented 5 years ago

The exhaustive_patterns feature currently seems to have a pretty big performance issue. Having perf top running while compiling librustc, I noticed that it spends an unreasonable amount of time inside ty::inhabitedness. Disabling exhaustive_patterns (it's not actually used in librustc/librustc_mir) shaved off ~30s (of 12min total) from stage1 compiler artifact compile time (in a totally unscientific benchmark, so take with a grain of salt).

I would expect that this issue will resolve itself if there is a consensus to go with references never being uninhabited (ref https://github.com/rust-lang/rust/pull/54125#issuecomment-433304976), as the reference handling is what I believe makes this expensive (not looking through references both reduces the amount of types we need to look at, and allows us to drop the expensive hashmap and nested hashsets used to avoid unbounded recursion). If the consensus goes in the other direction, then this code should probably be optimized a bit prior to stabilization of the feature.

jonas-schievink commented 5 years ago

cc https://github.com/rust-lang/rust/issues/51221, using this feature in let-bindings seems to ICE unless the never_type feature is also enabled

Nadrieril commented 3 years ago

The exhaustive_patterns feature currently seems to have a pretty big performance issue. Having perf top running while compiling librustc, I noticed that it spends an unreasonable amount of time inside ty::inhabitedness.

It is now the case that "references never are uninhabited", at least as far as is_ty_uninhabited_from is concerned. Is the performance issue still there? Otherwise, I expect that making is_ty_uninhabited_from into a query or something would significantly improve perf, because match checking repeatedly calls it on the same type and its subcomponents.

varkor commented 3 years ago

Is the performance issue still there?

I suspect this requires some investigation. (Perhaps it would be sufficient to open a PR enabling the feature and do a perf run.)

Nadrieril commented 3 years ago

So, apart from the perf issue (which is being worked on), the blocker seems to be the never patterns story that @nikomatsakis mentioned. I'll try to summarize the issue and the current status of this feature. Hopefully this can serve as a partial stabilization report.

Unsafe makes more things reachable

The issue is that in unsafe code we may come across patterns that would be unreachable in safe code but could actually be reachable in unsafe code. The two examples @nikomatsakis mentions are &! references (since such a reference could e.g. point to uninitialized memory) and partially-initialized structs/tuples. To illustrate the first case:

let x: Option<&!> = foo();
match x {
    Some(x) => {
        // This branch might be accessible if we create the reference unsafely.
    }
    None => {}
}

Here we might legitimately get such a reference and want to operate on it in unsafe code, so we don't want the branch to be unreachable. To illustrate the second case (playground):

union Uninit<T> {
  value: ManuallyDrop<T>,
  uninit: (),
}

fn main() {
  unsafe {
    let mut x: Uninit<(u32, !)> = Uninit { uninit: () };
    (*x.value).0 = 22; // Initialize first part of the tuple
    match *x.value {
      (v, _) => { // This is reachable and apparently not UB
        println!("{}", v);
      }
    }
  }
}

Since those branches are reachable, we really musn't allow the user to omit them. If we did, the following would compile and be unsound:

unsafe {
    let x: &! = foo();
    let n: u64 = match Some(x) {
        None => 42,
    };
    // now `n` contains UB garbage
}

Yet in both those cases, we might legitimately want to treat those branches as unreachable in safe code. Typically we may want the following to compile:

fn safe_unwrap<T>(x: &Result<T, !>) -> &T {
    match x.as_ref() {
        Ok(x) => x,
    }
}

Current status of unreachable_patterns

The feature allows one to skip branches in a match expression (and similar like if let) when the compiler can determine that there would be no values that match it. Currently this includes:

Where the following types are deemed uninhabited:

This allows one to skip useless branches. Here are examples:

match x { // x: Result<T, !>
    Ok(_) => {}
}
let Ok(y) = x; // x: Result<T, !>
match x {} // x: !
match x {} // x: (u64, !, String)
match x {} // x: Result<!, !>
match x { // x: &!
    _ => {} // necessary because `&!` is considered inhabited
}
match x { // x: &!
    &_ => {} // this one is unreachable, because now the `_` has type `!`.
             // this is a bit awkward: the branch is unreachable, but if we remove it we get
             // a non-exhaustive match.
}
match x { // x: Result<T, &!>
    Ok(_) => {}
    Err(_) => {} // necessary because `&!` is considered inhabited
}
fn safe_unwrap<T>(x: &Result<T, !>) -> &T {
    match x {
        Ok(x) => x,
        // other branch not needed because the `Err` case contains an actual `!`, not a `&!`
    }
}
fn safe_unwrap<T>(x: &Result<T, !>) -> &T {
    match x.as_ref() {
        Ok(x) => x,
        Err(_) => {} // necessary because `&!` is considered inhabited
    }
}

Resolution?

Of the two issues mentioned in the never patterns blog post, the &! one is not a problem because we currently err on the side of inhabitedness for references. The case of partially-initialized tuples is however unsolved. I would have thought that matching on a partially-uninitialized value would be UB, but if it isn't then we need some other way to resolve this issue. I imagine we can also construct enum values with initialized discriminant and uninitialized contents, so allowing matches on partially-initialized values seems to require that we completely disable the exhaustive_patterns feature as it is currently implemented. I don't know how important the partially-initialized case is. If it is, then we need never patterns (or some other good idea); if it isn't, I guess we could stabilize the feature (once perf is fixed)?

Nadrieril commented 3 years ago

Hm, I have been informed that the Rustonomicon explicitly says that "a reference that points to an invalid value" is UB. I would deduce that there can't be a value of type &! even in unsafe code. But this reading seems to also rule out partially initialized values. Unless uninitialized memory does not count as an invalid value? I feel out of my depth here.

varkor commented 3 years ago

cc @RalfJung, who is the expert when it comes to the subtleties of UB.

RalfJung commented 3 years ago

Also see what I wrote here.

Here we might legitimately get such a reference and want to operate on it in unsafe code, so we don't want the branch to be unreachable. To illustrate the second case (playground):

That example is UB: when you read *x.value, that produces a value of type (u32, !), which is UB.

But this reading seems to also rule out partially initialized values.

There are two different things in Rust that people call "partial initialization". There's the safe kind that is understood by the compiler:

let x: (i32, i32);
x.0 = 4;
x.1 = 5;

Here, the not-yet-initialized parts of the data are not accessible to the program (this is ensured statically by the compiler), and thus there is no chance of violating any invariants.

And then there is partial initailization using MaybeUninit, where it is up to the programmer to ensure that only the already-initialized part of the data is read.

I don't know what you mean by "rule out partially initialized value", but when you have a value of type (bool, bool), it must be fully initialized. If you want to work with partially initialized data, that must be reflected in the types, e.g. by using MaybeUninit<(bool, bool)>.

Nadrieril commented 3 years ago

it is up to the programmer to ensure that only the already-initialized part of the data is read

I think the question was exactly whether match x { (v, _) => ... } counts as only reading the already-initialized data.

That example is UB: when you read *x.value, that produces a value of type (u32, !), which is UB.

Oh, I was starting to fear it wasn't. I this settled enough? Will it stay UB in the future, so that we can allow matching on (u32, !) with no branches?

RalfJung commented 3 years ago

I think the question was exactly whether match x { (v, _) => ... } counts as only reading the already-initialized data.

Well, I think it should (I view the _ pattern as expressing that the data is read and then discarded), but I don't think this has been explicitly decided.

Nadrieril commented 3 years ago

Ah, then I'm back to being confused. Shall we continue on Zulip?

Nadrieril commented 3 years ago

First, some good news: I got a PR up that most probably fixes the performance issue. Once/if it gets merged we can confirm if it really fixed perf.

Second, after some and a of on my part, I understand the situation better. I'll explain what I understood, even though it's nothing new. This explains why the feature can't be stabilized as it is today.

Here is what made it click for me: @RalfJung says that whether match x { _ => ... } should lint as unreachable for x: ! is not yet decided. That's because because rust has an access-based model for validity. The lang team may yet decide whether such a match counts as accessing x or not, i.e. whether this asserts the validity of x. If this doesn't assert the validity of x, then we can't make use of the fact that ! has no valid values. So we must not lint the branch as unreachable. I imagine a similar argument applies in the case of match x { Ok(_) => ..., Err(_) => ... } where x: Result<T, !>, which means we can't lint the Err(_) branch as unreachable.

On top of that, there's a subtly different question: can we allow match x { Ok(_) => ... }? (It took me a while to understand these were not the same question). If we do, then such a match asserts the validity of the ! contained in Err case. The problem is not soundness here, but explicitness. As @nikomatsakis explained in his never patterns blog post, we would have the absence of a branch count as reading a value/asserting its validity. Sounds like a big footgun for writers of unsafe code. Even worse: if we add an Err(_) branch it might be reachable. Not great.

All this means we can't stabilise the feature as it currently is. Not even a useful subset of it, since the main motivation is the Result<T, !> case, which as we've seen is problematic. We need some way to make the feature behave differently around unsafe code, and a fix for the explicitness issue. The never patterns idea sounds promising to fix both those issues; that will have to go through the RFC process.

gThorondorsen commented 3 years ago

Note that some people will want Ok(x) to be an irrefutable pattern when matching at type Result<T, !> (see e.g. #71984). A possible solution is to say that irrefutable matches (in let, parameters and for but not if let or while let) may add never patterns as needed when desugaring to an exhaustive match.

I believe it is backward compatible to first stabilize never patterns in match and later, after more discussion, change the behaviour of irrefutable matches as suggested above.

Nadrieril commented 3 years ago

Btw, perf is no longer a blocker: comparison url, measured here.

nikomatsakis commented 3 years ago

I do feel we need the never patterns feature -- or some equivalent thing. This is probably just blocked on never being stabilized.

bhgomes commented 2 years ago

Not sure if this is the expected behavior by the following does not compile:

#![feature(exhaustive_patterns)]
#![feature(never_type)]

pub trait Mode {
    type Known;
    type Unknown;
}

pub enum Value<T, M>
where
    M: Mode,
{
    Known(T, M::Known),
    Unknown(M::Unknown),
}

pub struct KnownOnly;

impl Mode for KnownOnly {
    type Known = ();
    type Unknown = !;
}

fn test(value: Value<bool, KnownOnly>) -> bool {
    let Value::Known(value, _) = value;
    value
}

https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=4ddd54ccfd38752dfcf9ae647289976f

In this case, the "exhaustiveness" of a pattern requires checking the trait implementation instead of the types specified on the function boundary. The exhaustiveness check does work if you remove the trait:

#![feature(exhaustive_patterns)]
#![feature(never_type)]

pub enum Value<T, K, U> {
    Known(T, K),
    Unknown(U),
}

fn test(value: Value<bool, (), !>) -> bool {
    let Value::Known(value, _) = value;
    value
}

https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=fdde6cee3fece28c003d097df523404b

yaahc commented 2 years ago

I do feel we need the never patterns feature -- or some equivalent thing. This is probably just blocked on never being stabilized.

Has any progress been made on never patterns in the last year or is this still waiting on someone to come along and implement them?

detly commented 1 year ago

Posting a simplified use-case here at someone's suggestion from the Discord.

fn never(a: std::convert::Infallible) -> std::convert::Infallible {
    match a {}
}

fn never_never(
    a: std::convert::Infallible,
    b: std::convert::Infallible,
) -> std::convert::Infallible {
    match (a, b) {}
}

Fails on 1.66.1 with:

error[E0004]: non-exhaustive patterns: type `(Infallible, Infallible)` is non-empty
 --> src/lib.rs:9:11
  |
9 |     match (a, b) {}
  |           ^^^^^^
  |
  = note: the matched value is of type `(Infallible, Infallible)`
help: ensure that all possible cases are being handled by adding a match arm with a wildcard pattern as shown
  |
9 ~     match (a, b) {
10+         _ => todo!(),
11+     }
  |

For more information about this error, try `rustc --explain E0004`.
error: could not compile `playground` due to previous error

Compiles on nightly (1.69.0-nightly (2023-01-22 5e37043d63bfe2f3be8f)) with #![feature(exhaustive_patterns)].

Playground link for stable version, nightly version.

This actually is from real production code — the original (callback-like) functions were generic, and we wanted compilation fail if the type parameters (used not-very-closely to the function definition) ever changed from Infallible. (Also it's nice in an abstract, self-documenting-code way.) So using eg. (_, _) => unreachable!() does not achieve this, nor does (a, b) => a (eg. both compile if the types are set to u8 instead of Infallible).

est31 commented 1 year ago

A stabilization PR has been opened: #110105

Nadrieril commented 1 year ago

I have a proposal that could move stabilization forward (at least regarding never patterns): https://internals.rust-lang.org/t/pre-rfc-lint-empty-arms-to-stabilize-exhaustive-patterns/19746

RalfJung commented 1 year ago

This is a blocking issue for this feature: https://github.com/rust-lang/rust/issues/117119

RalfJung commented 1 year ago

I also think the following should not be accepted:

#![feature(exhaustive_patterns)]
fn main() {
    #[derive(Copy, Clone)]
    enum Void {}
    union Uninit<T: Copy> {
        value: T,
        uninit: (),
    }
    unsafe {
        let x: Uninit<Void> = Uninit { uninit: () };
        match x.value {
        }
    }
}

Using never patterns, this match is equivalent to

        match x.value {
          !
        }

which is different from the _ pattern used in https://github.com/rust-lang/rust/issues/117119. But IMO we should not implicitly introduce ! patterns when the place expression (x.value) involves a union.

RalfJung commented 12 months ago

Does this feature ever recurse through references?

I think this here is completely fine:

fn test(x: &Result<T, !>) {
  match x {
    Ok(v) => ...
  }
}

Here we have an explicit Ok in the source that indicates "please read the discriminant" so it's clearly UB if x is an Err.

However, I am less sure about this one:

fn test(x: Result<&T, &!>) {
  match x {
    Ok(v) => ...
  }
}

To accept this means to basically commit to saying that &! is uninhabited. It's certainly uninhabited in terms of its safety invariant, but for the validity invariant this is an open question that @rust-lang/opsem does not have consensus on.

(For these examples, ! is just a stand-in for any uninhabited type, they should all behave the same.)

Nadrieril commented 12 months ago

Does this feature ever recurse through references?

It doesn't. If there is no user-provided pattern (like Ok(_), &_, (_, _), etc) that makes it inspect the insides of a type, it relies on tcx.is_inhabited_from which considers all references as inhabited.

RalfJung commented 12 months ago

Okay, great. :)

Nadrieril commented 9 months ago

I'm proposing a (hopefully uncontroversial) subset of this feature under the min_exhaustive_patterns feature gate (https://github.com/rust-lang/rust/pull/118803). I have hopes this can be stabilized quicker without needing to figure out never patterns or unsafe subtleties.

Nadrieril commented 7 months ago

Call for Testing

Hi all! The min_exhaustive_patterns feature implements a subset of exhaustive_patterns that can be stabilized soon. The feature is implemented and I am now calling for testing.

min_exhaustive_patterns covers an important part of exhaustive_patterns: it allows omitting an empty pattern when the empty value is matched by-value.

Testing the feature

To test the feature, enable #![feature(min_exhaustive_patterns)]. Compared to stable rust, this feature makes pattern matching take into account empty types in many situations (for details about which, see the tracking issue). In practice, this means you can get extra "unreachable pattern" warnings, and code such as the following is now allowed:

enum Void {}

fn foo() -> Result<String, Void> { ... }

fn bar() {
   let Ok(s) = foo();
   // or
   match foo() {
      Ok(s) => ...,
   }
}

If you already use #![feature(exhaustive_patterns)], try to use min_exhaustive_patterns instead. This will be enough unless you use empty types behind references, pointers, or unions. If min_exhaustive_patterns doesn't cover your usecase (and this isn't a bug), you'll have to keep using exhaustive_patterns for now.

Feedback

If you find a bug or the compiler crashes, please file an issue and tag me.

If you the feature is not doing what you think it should be doing, have a look at the feature description first, and file an issue if you found a case that doesn't match this description.

Nadrieril commented 7 months ago

I'm proposing to stabilize the min_exhaustive_patterns subset: https://github.com/rust-lang/rust/pull/122792

Nadrieril commented 2 months ago

min_exhaustive_patterns is now stabilized! This allows omitting empty arms when matching by-value. The other cases (behind a reference, a pointer, or inside an union) still require exhaustive_patterns.

safinaskar commented 1 week ago

Is this code supposed to work in the future?

struct A(Box<A>);
fn f(x: A) -> ! {
  match x {}

// Or this:
  match x {
    !,
  }
}

A is uninhibited in safe code, so theoretically the code above should compile. Is this planned? (Note: I don't need this feature, this is not feature request, it is just pure curiosity.)

RalfJung commented 1 week ago

No, I am not aware of any plans trying to reason about cyclic types.