dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License
10.07k stars 1.56k forks source link

Type inference is confusing on pattern context for List<String>.map(jsonDecode) #54676

Open whesse opened 7 months ago

whesse commented 7 months ago

Applying type inference on List\ a; [ for (final {'event': String event} in a.map(jsonDecode)) switch(event) .... ];

gives an error where the type dynamic can't be inferred for the .map call.

Changing the call to .map\ makes the error go away.

The error is rather confusing, saying that it tries the candidate dynamic but it doesn't work. The error (from tip-of-tree 'dart analyze', also is the same on stable 3.2.3) is

error • benchmarks.dart:3575:20 • Couldn't infer type parameter
          'T'.

          Tried to infer 'dynamic' for 'T' which doesn't work:
            Return
          type declared as 'Iterable<T>'
                        used where 
          'Iterable<Map<_, String>>' is required.
          The type 'dynamic' was
          inferred from:
            Parameter 'toElement' declared as     'T
          Function(String)'
                                  but argument is 'dynamic
          Function(String, {Object? Function(Object?, Object?)?
          reviver})'.

          Consider passing explicit type argument(s) to the
          generic.

           • could_not_infer

1 issue found.

Sample code showing the error is below: The code runs without error, does not fail due to type inference failure.


import "dart:convert";

List<String> lines = [];

class BenchmarkParseResult {
  double counter;

  BenchmarkParseResult(this.counter);
}

List foo(int iterations) =>
   [
      for (final {'event': String event, 'counter-value': String counterString}
          in lines.map(jsonDecode))
        ...switch (event) {
          'cycles:u' => [
              BenchmarkParseResult(
                  double.parse(counterString) / iterations)
            ],
          'instructions:u' => [
              BenchmarkParseResult(
                  double.parse(counterString) / iterations)
            ],
          _ => [],
        }
    ];

void main() {
  print(foo(1));
}

Simpler cases do not give an error:

for (Map foo in lines.map<dynamic>(jsonDecode))
for (final foo in lines.map(jsonDecode))

The simple case without the pattern (or the control-flow expression) does give an error, but it is more comprehensible. With the pattern instead, it gives the error shown above:

for (Map foo in <String>[].map(jsonDecode)) {}
for (final {'foo': int foo}  in <String>[].map(jsonDecode)) {}

The comprehensible error tries to propagate the Map type in to the jsonDecode argument:

The argument type 'dynamic Function(String, {Object? Function(Object?, Object?)? reviver})' can't be assigned
to the parameter type 'Map<dynamic, dynamic> Function(String)'.dart[argument_type_not_assignable]
(https://dart.dev/diagnostics/argument_type_not_assignable)

@johnniwinther @athomas

stereotype441 commented 7 months ago

Thanks for the bug report, @whesse! I dug into it, and your test case highlights a subtle difference between type inference in the analyzer and the CFE. Here's what's happening:

The question is, what do we want to happen? I can't find anything in https://github.com/dart-lang/language/blob/main/resources/type-system/inference.md specifying whether or not it should be an error if the set of type constraints is contradictory.

I can think of good arguments in both directions. On the one hand, if type inference comes up with contradictory type constraints, emitting an error seems like good thing, because mathematical systems like type theories are well known to behave in really weird and counterintuitive ways when trying to reason from a contradiction; it's probably better to just bail out and let the user decide what to do. On the other hand, the type produced by type inference in this case actually works just fine, and so producing an error message is really confusing for the user (as @whesse observed, the analyzer seems to be claiming that a type parameter of dynamic doesn't work, yet supplying an explicit type parameter of dynamic actually does work).

I think it's also worth considering whether we can glean some insights from this example into how type inference could be improved in the future. For example, we have been considering adding additional kinds of type schema "operators" to represent sets of types that aren't expressible using the current framework; if we had an operator coercibleTo(T) (representing any type that is usable where an expression of type T is expected), then maybe the context for the expression lines.map(jsonDecode) could have been Iterable<coercibleTo(Map<_, String>)>. Then the constraints for T would have been dynamic <: T <: coercibleTo(Map<_, String>), which isn't contradictory, because dynamic is coercible to Map<_, String>. But this idea would need a lot more careful thought.

In any case, I think these are questions for the language team.

TL/DR: if, during type inference, a set of contradictory type constraints is created, should that be a de facto compile-time error? Or should it only be a compile-time error if it leads to a set of types being chosen that cause problems in later analysis? The analyzer takes the former approach; the CFE takes the latter.

What do you think, @dart-lang/language-team?

eernstg commented 7 months ago

I would say that contradictory type constraints should be detected, and the situation should be considered to be a type inference failure ("could not solve the constraints").

The response to a type inference failure could vary: A compile-time error could be reported. Alternatively, dynamic could be used with every type variable whose constraints do not have a solution (if that's well-defined, we could have mutual dependencies between type variables via their bounds). I would prefer to have a warning if we do something like the latter. (I think we do both today, in different situations.)

However, I don't think we can ignore the fact that the constraints have been violated (say, if we have num <: X <: int and choose X to be num), because it could create soundness issues if an actual type argument in an instance creation or in an invocation of a generic method violates its bound. Those guys can't even be super-bounded!

leafpetersen commented 7 months ago

A couple of preliminary comments.

First, here's a simpler reproduction:

// The argument to this function will be checked in a context
// that looks like Map<_, String>
void mkMapContext<T>(Map<T, String> argument) {}

// This method returns a T where T is its argument type.
T id<T>(T x) => x;

void main() {
  dynamic d = <dynamic, String>{};
  mkMapContext(id(d));
}

Second: I'm not directly concerned about soundness here. Because of the dynamic nature of Dart, there are lots of places where inference (or other algorithms) may produce unsuitable types. There are type systems in which if inference succeeds, the program is well-typed: Dart does not have such a system. So in general, it is incumbent upon the front-ends to do the necessary assignability, bounds, and well-boundedness checks after inference. If they are not doing so correctly, that is of course potentially a soundness bug, but I don't think that's really the core of this issue.

Third: This is, I think, a consequence of the fact that in a few cases (mostly around implicit dynamic casts) context types are not requirements, but the subtype matching algorithm in inference essentially treats them as if they were. That is, when we have a context K and we're matching it against a type T solving for X, we treat the result as constraints on X even though in some sense it isn't a hard requirement. So for example, if K is int and T is just X, then subtype matching X <# int will yield the constraint that X <: int must be true. But that constraint is not actually a hard constraint in Dart, since choosing X to be dynamic will potentially also be a valid solution.

I don't immediately have in mind what the right answer here is. A couple of thoughts:

I think my weak preliminary thought is that this check should only be done when strict-inference is enabled, but I'm not sure about that. Some further analysis/discussion may be in order.

lrhn commented 7 months ago

I can see how we end up with dynamic <: T and T <: Map<X, String> for

void mkMapContext<X>(Map<X, String> argument) {} // Renamed type parameter to `X` to not have two `T`s.
T id<T>(T x) => x;
void main() {
  dynamic d = <dynamic, String>{};
  mkMapContext(id(d));
}

The mkMapContext imposes a context type of Map<X, String> on id(d), id imposes a context of T on d with T <: Map<X, String>, d has type dynamic so dynamic <: T.

My feeling is that it's a problem because the type system ignores implicit downcasts, and the code author does not. The dynamic <: T is not a real requirement that can fail when the LHS is the type of an expression and the RHS is a context type. A dynamic is a valid type in that context, because it's assignable to T, whatever T ends up being, so it doesn't introduce any constraint.
It's more of a "the type was dynamic" marker than a real constraint.

I don't want to not introduce any constraint, because id(d) by itself should have type dynamic, So I probably want a different kind of constraint. Say, for something like this, we introduce the constraint dynamic <:= T, which means dynamic has occurred and is assigned to T. That doesn't tell us what T is, but if we have no other constraint on T, we'll just use dynamic. In either case we'll downcast the dynamic to T, which may just happen to be trivial.

Here we do have a constraint, T <: Map<X, String>. I'd then try to solve dynamic <:= T and T <: Map<X, String> by choosing T = Map<X, String> and add the constraint dynamic <:= X. (So we'll downcast the dynamic to Map<X, String>, whatever X ends up being.)

When we then don't find any other constraints for X, we let X be dynamic, and the inference becomes:

  mkMapContext<dynamic>(id<Map<dynamic, String>>(d /* as Map<dynamic, String>*/));

I'm probably missing a lot of details, but it feels like something like this would match the actual behavior of the program, and how the auther is thinking about dynamic. (I know I did, reading mkMapContext(id(d));, I filled in the dynamics exactly like that, solving Map<_, String> against dynamic by putting dynamic into the _, and not, e.g., using instantiate to bounds.)

stereotype441 commented 7 months ago

@lrhn, I'm encouraged by your idea. Phrasing it another way, it seems like what you're doing is trying to encode the possibility of an implicit dynamic downcast into the type constraints, so that when it comes time to solve for the type variables, we can see that there is a solution rather than failing.

But I'm having trouble trying to generalize it so that it can apply to @whesse's initial example. Removing some non-essential parts for simplicity, we've got this:

import "dart:convert";

List<String> lines = [];

List foo(int iterations) => [
      for (final {'event': String event} in lines.map(jsonDecode))
        ...switch (event) {
          _ => [],
        }
    ];

void main() {
  print(foo(1));
}

In this case, we only have a single generic method to infer (lines.map), and the constraints are coming from:

I can't figure out how to apply your <:= notation to this in a way that makes it work.

I have some ideas (along the lines of the coercibleTo operator I mentioned in my previous comment), but it seems like you might be onto something simpler than that, so I'd like to understand your approach better before diving into mine.

lrhn commented 7 months ago

Ack. It doesn't work because we don't actually have any dynamic-typed expressions here. We have a function context which expects a function which returns T, and a function which actually returns dynamic (not a static type, the actual reified return type of the runtime function), so in that case dynamic <: T is a strong requirement. (Which can be solved by making T be dynamic, if nothing else got in the way.)

So more cleverness might be needed. (I did say I was probably missing something.)

The implicit downcast that I, as a reader, would expect here is at the destructuring assignment. If the lines.map(jsonDecode) would infer Iterable<dynamic>, then everything works. That's the type it would have with no context type, so if we could give it that type with a context type too, if the context type is not actually useful in inferring something else, maybe we'd be fine. There are no implicit downcasts in Then thatdynamic` element type is the type of the value assigned to the declaration.

If we had just rejected a List<Map<String, dynamic>> where we expect a Map<String, String> element type, then that would be consistent. We can't statically ensure the type. But we allow it, even if fully typed:

void main() {
  for (var {"a": String s} in <Map<String,dynamic>>[<String, dynamic>{"a": "b"}] as dynamic ) {
    print(s);
  }

  for (var {"a": String s} in <Map<String,dynamic>>[<String, dynamic>{"a": "b"}] as Iterable<dynamic>) {
    print(s);
  }

  for (var {"a": String s} in <Map<String,dynamic>>[<String, dynamic>{"a": "b"}] as Iterable<Map<dynamic, dynamic>>) {
    print(s);
  }
}

These all work. A for (T v in e)... allows e to be dynamic, in which case it's downcast to Iterable<dynamic> first, and if e it's an Iterable<T2>, then T2 doesn't necessarily need to be assignable to T, it needs to be destructurable to T. That's a lot of special casing, and no wonder it's hard to figure out what to do. I assume this comes from the specification of for/in still being by syntactic desugaring, where for (decl in expr) body becomes:

  TypeOf(expr) $iterable = expr; // Compile-time error if `TypeOf(expr)` is not assignable to `Iterable<dynamic>`.
  var $iterator = $iterable.iterator; // Compile-time error if type of `$iterator` is not assignable to `Iterator<dynamic>`
  while ($iterator.moveNext()) { // Compile-time error if type of `$iterator.moveNext()` is not assignable to `bool`
    decl = $iterator.current; // Compile-time error if type of `$iterator.current` is not assignable to `decl`.
    { body }
  }

This desugaring happens post type-inference (has to, otherwise it cannot know the type of expr, and inferring on the desugared code would not give expr the correct context type), so the inference doesn't know about this desugaring. But the static and runtime semantics do, which is why we allow expr to be any of dynamic, Iterable<dynamic> and Iterable<Map<dynamic, dynamic>>, because we are doing multiple assignments, each one being able to do dynamic downcast independently.

So to match runtime behavior, inference would have to understand this rewrite, and match it in the inference rules. Which is probably non-trivial.

So, let's try again. We introduce S <:= T in general, meaning "S assignable to T", which can be used as type parameter bound as well, so a type parameter can be X <:= T, remembering its constraint. Then we say that the static typing of for/in is:

Let C be the context type induced by the declaration. Introduce a type variable $E <:= C. And a type variable $I <:= Iterable<$E> And then we infer the iterable expression with context type $I, to type T, and add the constraint T <:= $I

Then we try to solve. Somehow (not my speciality, I'll admit, so I'll handwave).

For the var {"a": String a} in list.map(jsonDecode) case we get:

As I said, "hand-waving". I don't know where we'd want to insert the $E <:= T instead of just T as context type. Anywhere and everywhere? Could be unwieldy. And I don't know how it will interact with the placement of implicit downcasts. If anything, it'll probably delay them as long as possible, because it can keep solving with dynamic <:= T for any T, until it reaches an actual constraint, like an assignment or argument position.

And I'm also not sure this isn't just "context type hints" with a more cumbersome syntax.

chloestefantsova commented 7 months ago

Say, for something like this, we introduce the constraint dynamic <:= T, which means dynamic has occurred and is assigned to T.

That's an interesting proposal. If we include coercions into the notion of the assignability constraint, we may finally solve the problem of giving a formal treatment to where assignments and the related checks and transformations actually happen in expressions.

stereotype441 commented 7 months ago

I feel like we've had a lot of good discussion here about potential future improvements to type inference to allow @whesse's example to work without having to choose a solution to contradictory type constraints. I'm definitely interested in those discussions for the long term, but they all sound like big enough changes to the type system that we'd probably want to tackle one of them as a language-versioned type inference improvement. So perhaps we should move that discussion to a separate language issue.

In the shorter term, we have a situation right now, for the current language version (and all previous language versions that are still supported) where there is a difference in behavior between the analyzer and the front end: if type inference leads to contradictory constraints, the analyzer rejects the program outright, whereas the front end "solves" the contradictory constraints by picking the "lower" constraint, and only rejects the program if that choice leads to a static type error.

I would love for us to declare one of these two behaviors to be the "correct one", and make a fix so that the two implementations match. Let me try to make a stab at the pros and cons of each approach:

  1. Reject any program that leads to contradictory type inference constraints. (This is what the analyzer currently does.) Pro: seems like a good idea from first principles. Also, it encourages users to add type arguments to clarify what behavior they want. Con: it's conceivable that some program in the wild could be broken by this change. However, this is mitigated by the fact that the analyzer already rejects such a program; therefore this approach would not be considered a breaking change according to our breaking change policy.
  2. Don't consider contradictory type inference constraints to be a compile-time error per se; only reject the program if the existing algorithm for "solving" the constraints (picking the "lower" constraint) leads to a compile-time error for some other reason. (This is what the front end currently does.) Pro: guaranteed not to break any existing program. Con: might conceivably cause a program to be silently accepted that really doesn't behave the way the user expects.
  3. Same as 2, except that if the analyzer is in strict inference mode, it rejects the program. (This is essentially what Leaf proposed above). Pro: non-breaking just like 2, but with the additional advantage that users who have already opted into strict inference will at least be alerted to the fact that something wonky is going on. Con: users who haven't opted into strict inference won't find out that something wonky is going on.

Personally, 3 is my first choice, and 2 is my second. But I could live with any of the three options.

eernstg commented 7 months ago

Personally, 3 is my first choice,

I'd prefer that one as well. It could then be strengthened to 1 later on.

lrhn commented 7 months ago

Do we know that this only affects dynamic (as in: If one of the types involved is not dynamic, then failing to find a solution implies that the original program was wrong)?

If so, would a specification of the 2. behavior be to ignore all constraints of the type dynamic <: T, or at least ignore them if they conflict with something else?

I'm OK with 3, but for any choice, we'll also need to specify the actual behavior we end up at.

stereotype441 commented 7 months ago

@lrhn

Do we know that this only affects dynamic (as in: If one of the types involved is not dynamic, then failing to find a solution implies that the original program was wrong)?

I don't think so. It could also apply any time the context type is not a required type, and contains a _. This could happen, for instance, when the context type comes from a promoted local variable. For example:

T id<T>(T x) => x;

f(Object o, Map<String, int> m) {
  if (o is String) {
    {'x': o} = id(m);
  }
}

Here, the pattern type schema for {'x': o} is Map<_, String>, based on the promoted type of o being String. So the constraints on the type parameter T of id are:

The front end favors the "lower" constraint, so it picks T = Map<String, int>, and this works; it means that the assignment simply demotes o back to Object.

The analyzer gives this error message:

  error - test.dart:5:16 - Couldn't infer type parameter 'T'.

          Tried to infer 'Map<String, int>' for 'T' which doesn't work:
            Return type declared as 'T'
                        used where 
          'Map<_, String>' is required.
          The type 'Map<String, int>' was inferred from:
            Parameter 'x' declared as     'T'
                          but argument is 'Map<String, int>'.

          Consider
          passing explicit type argument(s) to the generic.

           - could_not_infer
FMorschel commented 7 months ago

I'm sorry if I'm mistaken, but is this similar to what happened at #52117? The downcasting, I mean.

stereotype441 commented 7 months ago

@FMorschel

I'm sorry if I'm mistaken, but is this similar to what happened at #52117? The downcasting, I mean.

I'm not seeing a relationship between the two issues. But if you have some insight to share here, I'd love to hear more details.

FMorschel commented 7 months ago

As I said, I'm not sure, since some of the inference logic is new to me, but I figured it was similar because of https://github.com/dart-lang/sdk/issues/52117#issuecomment-1516918340:

[...] type inference is weird in this invocation. See below what we have currently, but don't consider it to be the right expectation. Apparently we don't infer R correctly. [...]

My question for that would be why is R inference the problem and not T.

More context here ```dart extension Apply on T { R apply(R Function(T self) transform) => transform(this); } void func(T object) { if (object is DateTime?) { final formatter = DateFormat('yyyy/MM/dd'); object?.day; // No lint object?.apply(formatter.format); // The argument type 'String Function(DateTime)' can't be assigned to the parameter type 'dynamic Function(T)'. } } ```

Since your comment above (mentioned below) I figured the problem there was the demoting of DateTime to Object when inferring Apply<T extends Object>'s T

This could happen, for instance, when the context type comes from a promoted local variable

[...] it means that the assignment simply demotes o back to Object.

P.S.:

I'm mostly sure you already know this info, but just to put it here as well since it relates to promoted local variables, and maybe it's part of the problem in this case. I found https://github.com/dart-lang/sdk/issues/56028:

[...] while a type variable itself has no members, it gets every member except "The Object 5" from its bound or promotion, where the promotion is known to be a subtype of the bound [...]

I'm sorry if this is not helpful or is unrelated, for me it sounded like a similar problem.

stereotype441 commented 7 months ago

@FMorschel Thanks for the additional details! I think I understand where you're coming from now. After digging a little deeper into things, I think the two issues turn out to be different.

In the case of https://github.com/dart-lang/sdk/issues/52117#issuecomment-1516918340, we had two issues:

details about #52117 When @scheglov says "Apparently we don't infer `R` correctly", I believe he's referring to this part of his code snippet: ``` methodName: SimpleIdentifier token: apply staticElement: MethodMember base: self::@extension::E::@method::apply substitution: {T2: T, R: R} staticType: void Function(R Function(T)) ``` (This is part of the text passed to `assertResolvedNodeText`, which means it's part of a dump of the data structures produced by the analyzer in analyzing the code example. In particular this is a dump of the analyzer's data structure for the simple identifier `apply` in the line `object?.apply(g);`.) The line `base: self::@extension::E::@method::apply` means that `apply` got resolved to the extension method `E.apply`, which seems correct. The definition of that extension method is: ```dart extension E on T2 { void apply(R Function(T2 self) transform) {} } ``` Note that @scheglov helpfully renamed `T` to `T2` when he created his example, so it could be distinguished from the `T` that's in scope at the point where `apply` is _used_: ```dart void f(T object) { if (object is int?) { object?.apply(g); } } ``` Since `apply` is defined as a generic method inside a generic extension, that means that in order to resolve it, the analyzer has to select types that are valid at the location where `apply` is used, and substitute those types in place of the type parameters of both generics (`T2` and `R`). The only type parameter that it should be using in the substitution is the one type parameter that's in scope at that point, which is `T`. But this line of the dump: ``` substitution: {T2: T, R: R} ``` indicates that the type the analyzer has chosen to substitute in place of the type parameter `R` is... `R` itself. That's not just invalid, it's not even well-formed! Because `R` doesn't have any meaning at the point where `apply` is used; it's not in scope there. You ask why `R` is the problem and not `T`. I think that's a valid question, because the only effect that `R` has on the analysis is to determine the return type that `g` has to have (`g` is defined as `double g(int a) => 0.0;`; it corresponds to `formatter.format` from your example). Whereas the user-visible problem in https://github.com/dart-lang/sdk/issues/52117 is that if `object?.apply(g)` is changed to `(object as int?).apply(g)`, a bogus `unnecessary_cast` lint appears. This change doesn't affect `g`; it just affects the type of the thing that the extension is getting _applied to_. That type is either `T&int?` or simply `int?` in @scheglov's reduced example, and those types don't involve `R`, and it's getting matched against the extension's "on" type (`T2`), which doesn't involve `R`. So why would the type of `R` be relevant? My answer to that question is: I think you're right. The ill-formed type being substituted for `R` probably _isn't_ the cause of the bogus lint (or I guess I should say, it _wasn't_ the cause of the bogus lint, since from [your comment on the issue today](https://github.com/dart-lang/sdk/issues/52117#issuecomment-1921146770) it looks like the issue isn't happening anymore). That's not to say that the problem with `R` shouldn't be investigated; it probably should, as a separate issue. I'll go over there and make a comment to encourage someone to investigate it. But anyway, back to the crux of your question, which is, is what's going on here in _this_ issue similar to what happened in https://github.com/dart-lang/sdk/issues/52117?

What's happening here in this issue is that type inference is trying to simultaneously satisfy two constraints for the type parameter T that are not possible to satisfy at the same time. The analyzer code detects that impossibility and gives an error. The front end code plows ahead, choosing to substitute dynamic for T, and it so happens that this works because dynamic can be downcast to any other type. There's no ill-formed type involved, it's just that the two implementations disagree on what to do when two type constraints are contradictory. For more details see https://github.com/dart-lang/sdk/issues/54676#issuecomment-1900838420.

So no, I don't think the issues are related. But thanks for asking! I hope my answer makes things clearer.