dart-lang / language

Design of the Dart language
Other
2.61k stars 200 forks source link

Pattern matching brittleness #3292

Open Hixie opened 10 months ago

Hixie commented 10 months ago

Consider some code such as the following, where A is some class defined in a library:

// library
class A {
  A(this.foo);
  final int foo;
}

// user code
void main() {
  final A a = A(1);
  switch (a) {
    case A(: final int foo): print('A, with foo=$foo');
    default: print('not an A');
  }
}

If the library ever changes A's definition so that foo is now an Object rather than an int, the user code will change semantically -- the default case will now be picked for As that happen to have non-int foos. There does not appear to be any compile-time lint or warning about this.

I suppose one might argue that the error is specifying a type on the binding foo, and that this is the "correct" way to write this:

// library
class A {
  A(this.foo);
  final int foo;
}

// user code
void main() {
  final A a = A(1);
  switch (a) {
    case A(: final foo): foo as int; print('A, with foo=$foo');
    default: print('not an A');
  }
}

...but that seems a bit non-idiomatic (and doesn't really work in the expression version of switch either). I suppose if you use foo in a way that forces its type to be a particular type (e.g. calling methods on it that are specific to its type, passing it to APIs that expect objects of its type, etc) then you lower the risk of the type being changed without the compiler failing without the explicit as. Maybe that's enough, and the assumption is that the rest of the time we don't care if the type isn't obvious in the switch code if it's, in practice, not sensitive to the type anyway?

I'm guessing it's too late to worry about this, and maybe we should just close this issue as "too bad"...

Hixie commented 10 months ago

(FWIW, this came up in a discussion on #hackers-framework on the Flutter Discord.)

munificent commented 10 months ago

There are a few things your pattern might want to express:

  1. Match if the value is an A. If so, extract its foo getter and bind it to a variable with foo's type. This is expressed as:

    case A(:final foo): print('A, with foo=$foo');

    If the type of the foo getter changes, this code will silently change the type of its local foo variable. That may in turn change other code, which could cause compile errors or not.

    It's morally the same as:

    foo(bar());

    If someone changes the return type of bar(), then the behavior of the call to foo() (which may be generic) could change in surprising ways.

  2. Match if the value is an A and the result of calling foo yields an int. That's:

    case A(:final int foo): print('A, with foo=$foo');

    This is idiomatic when you do want to do a runtime test of the extracted value's type. But as you note, it means the case may fail to match if the type foo is changed.

  3. Match if the value is an A and cast the value of foo to an int regardless of what it's static type is. For this, you wrote:

    case A(: final foo): foo as int; print('A, with foo=$foo');

    But you can also do, which is what I'd recommend:

    case A(: final foo as int): print('A, with foo=$foo');
  4. Match if the value is an A. If so, extract its foo getter and bind it to a variable of type int, which should not require a cast.

    This is a reasonable thing to want to write, but the pattern syntax doesn't have a way to express it.

The core problem is that in a refutable context, we overload the typed variable pattern syntax to mean both "declare a variable with this type" and "only match if the value has that type".

@eernst in particular raised this as a concern and suggested there should be a separate syntax for declaring typed variables that are always irrefutable versus those that imply a type test. But as far as I remember, we never came up with a syntax that we liked. It was particularly challenging given all of the other already really hard constraints on fitting pattern syntax into a language clearly not designed for it.

Instead, we chose a compromise that typed variables do type tests. In practice, almost all users rely on type inference for local variables so (4) is, I believe, the least common scenario a user might want to request. So if we had to discard one in order to simplify the overall syntax, I think we probably made the right call.

But you're right that there is an expressiveness hole. If you really want, you could always write:

case A(:final foo_):
  final int foo = foo_;
  print('A, with foo=$foo');
Hixie commented 10 months ago

yeah 4 is what i was hoping for; while it's true that many people use type inference, a subset of our community (including in particular the flutter framework) attempts to avoid all inference in an attempt to catch more bugs.

I guess the compiler will compile as foo to a no-op when the variable is statically known to be a foo? In which case 3 might be sufficient for our needs.

munificent commented 10 months ago

I guess the compiler will compile as foo to a no-op when the variable is statically known to be a foo?

I don't know, you'd have to check what the compilers do and see. You might get "unnecessary cast" warnings that you would have to ignore.

eernstg commented 10 months ago

We could have different takes on the situation where a declared type serves as a dynamic type test, but is intended to be an explicit declaration of a type which is already guaranteed:

@munificent already mentioned that the language team dropped the idea that the dynamic type test and the declaration of a known type could have different syntax. So we won't get that. But we could do something with lints:

If a certain organization makes the choice that known types should not be declared then they could get some help from a lint that flags such known types. It would then be known (unless that lint is ignored) that every variable pattern with a declared type (as in T x or final T x) is a dynamic type test.

Conversely, if known types are generally specified explicitly in variable patterns then every variable pattern that embodies a dynamic type test could be linted. The type test could then be performed in a guard.

Finally, I think there's an entire universe of possible static analyses and diagnostics that could be applied to pairs (or even larger sets) of versions of a given software artifact. The analysis should find a correspondence ("this declaration in VersionA is the same as that declaration in VersionB", and similarly for types, expressions, everything), and it should detect differences at some level and report them. For example "this expression has a new type", "this identifier resolves to a different declaration", "this declaration is the resolution of a different set of identifiers (because it's shadowed by a new declaration)", etc.etc.

Detecting that "the same" variable pattern used to declare a known type, and it is now a dynamic type test, that would just be one more example of a version-pair diagnostic.

lrhn commented 10 months ago

Refutable patters are based on runtime type, irrefutable on static type. That's why failing an irrefutable patterns is a compile-time error, and failing a refutable pattern is a runtime rejection of the pattern. You can ask for runtime checks that aren't statically obvious, like switch (o as Object) { case int _: ... }.

The issue here is that sometimes you want the contract as a subpattern of a refutable pattern.

 case MapEntry(key: int i, value: String s)

Here I know that the key has type int and the value type String, I'm not checking. I know that I'm iterating a Map<int,String>. If that changes, because someone messed with my map type, I want a compile-time error, not just failing to match, and not the silent accept of using var.

You will have to leave the refutable pattern to get irrefutability. Any test performed inside the refutable pattern is either runtime-throwing (as, !) or will just refute the pattern. There is no opt-out for refutability.

Which means you can't get the destructuring from the refutable pattern and also refutability inside the refutable pattern, so you need to either destructure and check outside of it:

case MapEntry m: var MapEntry(key: int i, value: String s) = m; ....

or destructure inside and check outside:

case MapEntry(key: var _i, value: var _s): var (int i, String s) = (_i, _s); ....

But you need to have an irrefutable type check outside of the refutable pattern, if you want a compile-time error if something changes in a breaking way.

rrousselGit commented 10 months ago

Part of me wonder if the issue isn't that the type isn't using the pattern syntax

We could have case Foo(:final int value) vs case Foo(:final int() value). Where the former doesn't do a pattern match, it simply defines the variable type.

munificent commented 10 months ago

We could have case Foo(:final int value) vs case Foo(:final int() value). Where the former doesn't do a pattern match, it simply defines the variable type.

Yeah, but it's not clear how useful that would be. And it would be a shame to give the shorter syntax to something most users don't end up caring to do.

Also, I can't remember the exact details but I believe we ran into a grammar ambiguity with allowing variable names after the () in an object pattern. Something to do with function types?

natebosch commented 7 months ago

Would it be sensible treat :final int value similar to value is int and report a diagnostic about the unnecessary type check? We still wouldn't have a syntax to re-state the inferred type for the local variable, but it could prevent confusion from authors who expect that re-stating the inferred type has no runtime behavior.

munificent commented 6 months ago

Would it be sensible treat :final int value similar to value is int and report a diagnostic about the unnecessary type check?

We already have a lint for omitting redundant local variable types. I would expect it to fire for typed variables in patterns too, but it looks like it doesn't. Filed: https://github.com/dart-lang/linter/issues/4817

natebosch commented 6 months ago

I would expect it to fire

My understanding of this issue is that :final int value has a syntax that looks like a local variable type but semantics closer to a type check. This confusion won't come up for the folks who are omitting local variable types with the lint - it will come up for the authors who prefer to include local variable types everywhere, and intended to do the same in a pattern without realizing they were getting a type check instead of a local variable type like they expected.

I think referring to the int in :final int value as a local variable type would cause more problems than it solves.

munificent commented 6 months ago

My understanding of this issue is that :final int value has a syntax that looks like a local variable type but semantics closer to a type check.

It's both. In an irrefutable context (pattern variable declaration, pattern for statement, etc.) it's a local variable type with no implied type test. In a refutable context (switch case, if-case, etc.) it is both a local variable type and a type test.

Overloading the syntax to mean different things in different contexts is definitely not a perfect solution, but it seemed the least bad of the alternatives.

This confusion won't come up for the folks who are omitting local variable types with the lint - it will come up for the authors who prefer to include local variable types everywhere, and intended to do the same in a pattern without realizing they were getting a type check instead of a local variable type like they expected.

...All the more reason to follow the recommended guidance and not write types for your local variables. :)

The nice thing about consistently omitting the types when they are redundant is that it clarifies that every written type is not redundant: it's an upcast, or a type test, or needed for inference, etc.

natebosch commented 6 months ago

I'd be in favor of an unnecessary type check diagnostic for :final int value (I think this syntax is only possible in refutable contexts) and omit_local_variable_types for :int value in assignment contexts.