Spec and implementations disagree about context for return/yield in function expressions

The analyzer and front end disagree on how to convert the context of a function expression into the context for the operands of return and yield inside the function expression, and they both disagree with the spec.

Paraphrasing from here, and ignoring legacy logic, the spec says to do this:

Let K be the context of the function expression.
If the function is sync (not a generator and not asynchronous), then the context for operands of return in the function expression is K.
If the function is async* and K is Stream<S> for some S, then the context for operands of yield in the function expression is S.
Otherwise, if the function is sync* and K is Iterable<S> for some S, then the context for operands of yield in the function expression is S.
Otherwise, the context for operands of return/yield in the function expression is FutureOr<futureValueTypeSchema(K)> (where futureValueTypeSchema is defined here).
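To make the fall-through order of these rules easier to see, here is a rough Python sketch of the paraphrase above. It is only a model: the `Ty` class, the `spec_context` name, and the marker strings are made up for illustration, "K is Stream<S>" is assumed to mean an exact, non-nullable match, and futureValueTypeSchema is passed in as a parameter (`fvts`) because its definition isn't reproduced in this issue.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Ty:
    """Toy type schema: a name, its type arguments, and an optional trailing `?`."""
    name: str
    args: Tuple["Ty", ...] = ()
    nullable: bool = False


def spec_context(marker: str, k: Ty, fvts: Callable[[Ty], Ty]) -> Ty:
    """Context for return/yield operands under the paraphrased spec rules.

    `marker` is one of "sync", "async", "sync*", "async*" (hypothetical encoding).
    `fvts` stands in for futureValueTypeSchema.
    """
    def element_of(name: str) -> Optional[Ty]:
        # Assumption: "K is Stream<S>" means an exact, non-nullable match.
        if k.name == name and len(k.args) == 1 and not k.nullable:
            return k.args[0]
        return None

    if marker == "sync":
        return k
    if marker == "async*" and element_of("Stream") is not None:
        return element_of("Stream")
    if marker == "sync*" and element_of("Iterable") is not None:
        return element_of("Iterable")
    # Fallback: reached by async functions, and also by generators
    # whose K does not match Stream/Iterable.
    return Ty("FutureOr", (fvts(k),))


INT = Ty("int")
identity = lambda t: t  # placeholder only, NOT the real futureValueTypeSchema

yield_ctx = spec_context("async*", Ty("Stream", (INT,)), identity)
fallback_ctx = spec_context("async*", INT, identity)
```

Note that in this reading an async* function whose K is not a Stream ends up in the FutureOr fallback, which is one of the points where the front end does something different.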

The analyzer behavior differs from the spec in the following ways:

If K is _ or dynamic, then the context for operands of return and yield in the function expression is _, regardless of the function expression's async/generator marker.
Matching of Stream and Iterable is done by ignoring trailing ?s, replacing type parameters with their bounds, and then using "as instance of" semantics (e.g. if K is T&MyStream?, where MyStream extends Stream<int>, then that produces the same result that K=Stream<int> would).

The front end behavior differs from the spec in the following ways:

Matching of Stream and Iterable is done using "union free" semantics (ignoring trailing ?s and unwrapping FutureOr<S> to S), but otherwise requiring a precise match (e.g. if K is FutureOr<Stream<int>?>?, then that produces the same result that K=Stream<int> would, but if K is MyStream, where MyStream extends Stream<int>, then that is considered not to match).
If the function is async* or sync* and K doesn't match Stream (or, respectively, Iterable), then the context for operands of yield in the function expression is _ (not FutureOr<futureValueTypeSchema(K)>).
If the function is async, then the context for operands of return in the function expression is wrapFutureOr(futureValueTypeSchema(K)), where wrapFutureOr is defined as follows:
- wrapFutureOr(FutureOr<S>?) = FutureOr<S>?
- wrapFutureOr(FutureOr<S>) = FutureOr<S>
- Otherwise, wrapFutureOr(S) = FutureOr<S>
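As a quick illustration of the three wrapFutureOr cases, here is a small Python sketch. The `Ty` class and the `wrap_future_or` name are invented for this example and are not taken from the front end's source; the point is only that an existing FutureOr (nullable or not) is left alone, while everything else gets wrapped once.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Ty:
    """Toy type schema: a name, its type arguments, and an optional trailing `?`."""
    name: str
    args: Tuple["Ty", ...] = ()
    nullable: bool = False


def wrap_future_or(s: Ty) -> Ty:
    """Model of wrapFutureOr as described in the list above."""
    if s.name == "FutureOr" and len(s.args) == 1:
        # Covers both FutureOr<S> and FutureOr<S>?: returned unchanged.
        return s
    # Any other schema is wrapped exactly once.
    return Ty("FutureOr", (s,))


INT = Ty("int")
already_wrapped = Ty("FutureOr", (INT,), nullable=True)  # FutureOr<int>?

unchanged = wrap_future_or(already_wrapped)  # stays FutureOr<int>?
wrapped = wrap_future_or(INT)                # becomes FutureOr<int>
```

The spec's rule, by contrast, wraps unconditionally, so the two only differ when futureValueTypeSchema(K) is itself a FutureOr.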

That's a lot of behavioral differences! We should pick a behavior to standardize on, and update spec, CFE, and analyzer to all match.

My gut feeling is that the behavior we want is probably a mixture of all three. Perhaps something like this:

Let K be the context of the function expression.
If the function is async*:
- If unionFree(K) is Stream<S> for some S, then the context for operands of yield in the function expression is S. Otherwise, it's _.
- Where unionFree is defined as follows:
  - unionFree(S?) = unionFree(S)
  - unionFree(FutureOr<S>) = unionFree(S)
  - Otherwise, unionFree(S) = S.
If the function is sync*:
- If unionFree(K) is Iterable<S> for some S, then the context for operands of yield in the function expression is S. Otherwise, it's _.
If the function is async:
- Let S be futureValueTypeSchema(K).
- If S is _ or dynamic, then the context for operands of return in the function expression is _.
- Otherwise, it's FutureOr<S>.
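For concreteness, the proposed rules could be modeled roughly like this in Python. Everything here is a sketch under assumptions: `Ty`, `union_free`, `proposed_context`, and the marker strings are illustrative names, futureValueTypeSchema is again passed in as `fvts` rather than defined, and the sync case raises because the proposal above doesn't restate it.

```python
from dataclasses import dataclass, replace
from typing import Callable, Tuple


@dataclass(frozen=True)
class Ty:
    """Toy type schema: a name, its type arguments, and an optional trailing `?`."""
    name: str
    args: Tuple["Ty", ...] = ()
    nullable: bool = False


UNKNOWN = Ty("_")
DYNAMIC = Ty("dynamic")


def union_free(s: Ty) -> Ty:
    """unionFree: strip trailing `?` and unwrap FutureOr<S>, recursively."""
    if s.nullable:
        return union_free(replace(s, nullable=False))
    if s.name == "FutureOr" and len(s.args) == 1:
        return union_free(s.args[0])
    return s


def proposed_context(marker: str, k: Ty, fvts: Callable[[Ty], Ty]) -> Ty:
    """Context for return/yield operands under the proposed rules."""
    if marker in ("async*", "sync*"):
        wanted = "Stream" if marker == "async*" else "Iterable"
        u = union_free(k)
        # Precise match only: a subtype such as MyStream does not count.
        if u.name == wanted and len(u.args) == 1:
            return u.args[0]
        return UNKNOWN
    if marker == "async":
        s = fvts(k)
        if s in (UNKNOWN, DYNAMIC):
            return UNKNOWN
        return Ty("FutureOr", (s,))
    # The proposal text only lists async*, sync*, and async.
    raise ValueError(f"marker not covered by the proposal: {marker!r}")


INT = Ty("int")
identity = lambda t: t  # placeholder only, NOT the real futureValueTypeSchema

# K = FutureOr<Stream<int>?>?
k = Ty("FutureOr", (Ty("Stream", (INT,), nullable=True),), nullable=True)
from_union = proposed_context("async*", k, identity)
from_subtype = proposed_context("async*", Ty("MyStream"), identity)
```

With these definitions, `from_union` is the `int` schema and `from_subtype` is `_`, which matches the front end's current matching behavior rather than the analyzer's "as instance of" matching.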

But I think that before deciding for sure, it would be worth doing some investigation to see how breaking this would be.

@dart-lang/language-team any thoughts?

Likely related to https://github.com/dart-lang/language/issues/3148, or at least in the same ballpark. Also related: https://github.com/dart-lang/language/pull/3151, which I really need to get landed real soon now. I hope it specifies the correct behavior, and maybe even some of what the compilers have actually implemented.

The paraphrasing (surely accidentally) omits one crucial detail: If the context type scheme of the function literal is C, and C is a function type scheme, then the K used in the following steps is the return type (scheme) of that function type (scheme). It's called the imposed return type schema. (If C is not a function type, the imposed return type schema is probably _, which also has its problems with inference.)

That also means that K cannot be an intersection type. If C is an intersection type, like X & int Function(int), we could use the function part of that to derive K, but I don't know if we do. But K itself can never be an intersection type. Phew! One less tricky case to worry about. Well, unless we do something special for x = () { return e; }(); to give () { ... } a context type of typeof<x> Function(), which we may end up doing with vertical inference.

Generally, a function expression with a context type scheme of R Function(...) should behave much like a function declaration with a return type of R (modulo not all schemes being types, so "greatest closure of" where appropriate). If we infer the return type of the function expression during downward inference, it should be exactly that, but I don't know if we allow ourselves to refine the type during upward inference. In either case, the imposed return type scheme is then used to derive the type needed at return or yield statements, which becomes the context type for their operands (except in async functions, where the implicit await in returns may add an extra FutureOr).

For sync* and async*, the #3151 PR defines functions similar to futureValueTypeSchema, imaginatively named streamElementTypeSchema and iterableElementTypeScheme. They are similar to what you describe, without using a unionFree helper. (I just don't trust helper functions when it comes to structural recursion; it's far too easy to miss something. I want every case handled explicitly, where I can see them!) The biggest difference is that a top type of dynamic or void also becomes the element type, just because it makes some kind of sense to me that a dynamic foo() async* { ... } should have a context type of dynamic for elements, and void foo() async* { ... } should use void. Not because it makes any real difference in a contravariant context.

The rule for async functions should probably incorporate the context rules for await, as mentioned in another issue. I wasn't aware of that rule when I specified async return or wrote #3571; otherwise it would have done the same to the return context type as await does. (Or I should move on #870 and get rid of the FutureOr entirely.)

Spec and implementations disagree about context for return/yield in function expressions #3664