rust-lang / unsafe-code-guidelines

Forum for discussion about what unsafe code can and can't do
https://rust-lang.github.io/unsafe-code-guidelines
Apache License 2.0

Is pattern evaluation order guaranteed? #540

Open zachs18 opened 1 month ago

zachs18 commented 1 month ago

cc https://github.com/rust-lang/reference/issues/1665

For most patterns and in safe code, "evaluation"(/matching?) order of subpatterns does not matter, but there is (that I can think of) one instance on stable where pattern evaluation order matters: matching on a struct with a tag and a union field (and similar situations).

The Reference section on unions does mention pattern matching, but does not say anything about pattern evaluation order. It gives an example of pattern-matching on a manual tagged union, though pattern evaluation order does not matter for the example given[^2]. In a slightly different example, however, the field order does matter:

my original example

```rs
#[derive(Clone, Copy)]
enum Tag {
    A,
    B,
}

#[derive(Clone, Copy)]
#[repr(C)]
union Value {
    a: u32,
    b: u8, // note that b is smaller than a
}

/// Assume that if tag == Tag::A, then val.a is valid, and if tag == Tag::B, then val.b is valid.
#[derive(Clone, Copy)]
struct Tagged {
    tag: Tag,
    val: Value,
}

unsafe fn tag_first(v: Tagged) -> bool {
    match v {
        // fine under miri with tag == B, sees that `tag != A` and skips the arm
        Tagged { tag: Tag::A, val: Value { a: 0 } } => true,
        _ => false,
    }
}

unsafe fn val_first(v: Tagged) -> bool {
    match v {
        // error under miri with tag == B, since it reads the padding bytes after `Value::b`
        Tagged { val: Value { a: 0 }, tag: Tag::A } => true,
        _ => false,
    }
}

fn main() {
    let v = Tagged {
        tag: Tag::B,
        val: Value { b: 0 },
    };
    unsafe {
        tag_first(v);
        val_first(v);
    }
}
```
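For comparison, here is a sketch of one way to write the same check so that it does not depend on subpattern order at all: the pattern only tests the tag and binds the union field, and the union read moves into a match guard, which runs only after the pattern has matched. The function name `is_a_zero` is hypothetical, and the invariant (tag == Tag::A means `val.a` is valid) is the same assumption as in the example above.

```rust
#[derive(Clone, Copy)]
enum Tag {
    A,
    B,
}

#[derive(Clone, Copy)]
#[repr(C)]
union Value {
    a: u32,
    b: u8,
}

/// Assumed invariant: if tag == Tag::A then val.a is valid,
/// and if tag == Tag::B then val.b is valid.
#[derive(Clone, Copy)]
struct Tagged {
    tag: Tag,
    val: Value,
}

/// Hypothetical helper: true if `v` is tagged `A` and holds `a == 0`.
fn is_a_zero(v: &Tagged) -> bool {
    match v {
        // The pattern itself only tests `tag`; `val` is just bound by reference.
        // The union field is read in the guard, which is evaluated only
        // after the pattern (and therefore the tag test) has succeeded.
        // SAFETY: by the assumed invariant, tag == Tag::A means `val.a` is valid.
        Tagged { tag: Tag::A, val } if unsafe { val.a == 0 } => true,
        _ => false,
    }
}
```

With this shape, calling `is_a_zero` on a value built as `Tagged { tag: Tag::B, val: Value { b: 0 } }` never reads `val.a`, whichever order the fields are written in the pattern.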

a simpler but basically the same example

```rs
fn main() {
    union Union { value: u8, _empty: () }
    struct MyOption { tag: u8, value: Union } // assume tag == 1 means value.value is valid
    let foo = MyOption { tag: 0, value: Union { _empty: () } };
    unsafe {
        match foo {
            // currently fine under Miri, since `tag` is mentioned first
            MyOption { tag: 1, value: Union { value: 0 } } => true,
            _ => false,
        };
        match foo {
            // currently this is UB under Miri if value is `_empty`/uninit, regardless of the tag field
            MyOption { value: Union { value: 0 }, tag: 1 } => true,
            _ => false,
        };
    }
}
```
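A workaround that makes the order explicit, as a minimal sketch: match on the tag in an outer `match` and only touch the union inside the arm where the tag says it is valid. The helper name `is_some_zero` is hypothetical; the types mirror the simplified example, with the same assumption that tag == 1 means `value.value` is valid.

```rust
union Union {
    value: u8,
    _empty: (),
}

struct MyOption {
    tag: u8, // assume tag == 1 means value.value is valid
    value: Union,
}

/// Hypothetical helper: true if `opt` is tagged as valid and holds 0.
fn is_some_zero(opt: &MyOption) -> bool {
    match opt.tag {
        // The union is read only inside this arm, after the tag test has
        // already succeeded, so no subpattern ordering is involved.
        // SAFETY: by the assumed invariant, tag == 1 means `value.value` is valid.
        1 => unsafe { opt.value.value == 0 },
        _ => false,
    }
}
```

This costs the compactness of a single nested pattern, but the read of the possibly-uninitialized field is sequenced by ordinary control flow rather than by however the pattern happens to be evaluated.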

For unstable code, I suppose `deref_patterns` might also make it important to document pattern evaluation order, or maybe that feature is/will be restricted enough for it not to matter. Depending on the resolution of https://github.com/rust-lang/unsafe-code-guidelines/issues/412, pattern evaluation order might also be important when matching on references-to-references-to-invalid-data (miri example).

I'm not sure if this is fully the intended behavior[^1], or if it is intended, how best to document it.

[^1]: Alternatively, instead of documenting pattern evaluation order, it could be specified that if any (union) field used in a pattern match is invalid/uninitialized, then the whole arm is UB, regardless of the order the fields were written in the pattern.

[^2]: In that example, the union field is fully initialized either way, or UB happens regardless of pattern evaluation order.