Consider keeping supertype in joins

dart-lang / language

Design of the Dart language

Other

2.67k stars 205 forks source link

Consider keeping supertype in joins #2856

Open stereotype441 opened 1 year ago

stereotype441 commented 1 year ago

Currently, when flow analysis joins promotion chains in two control flow paths, it keeps only types that are common to both paths. So, for instance, if the declared type of x is Object?, and one control flow path promotes it to Object, then to num, then to int, whereas the other control flow path promotes it to num?, then to num, then to double, the only promotion that is kept is the promotion to num, because it is the only promotion that's common to both promotion chains.

When one promotion chain contains a type that's a subtype of a type in the other promotion chain, this can be surprising for users. For example, the following example came up in https://github.com/dart-lang/sdk/issues/51464:

void f(num? x) {
  switch (x) {
    case final a when a is int:
    case final a when a is num:
      a; // (1)
  }
}

It seems intuitively like a should be promoted to num at (1), because int is a subtype of num. And it would be sound to do so. But flow analysis doesn't promote a at all, because the two promotion chains are num? :> int and num? :> num, and the only common type between those two chains is num?.

Should we try to improve this? We could instead say that when joining a promotion chain that ends in U with a promotion chain that ends in V:

If U <: V, then the resulting promotion chain ends in V;
Otherwise if V <: U, then the resulting promotion chain ends in U;
Otherwise both U and V are dropped and the resulting promotion chain is formed by combining the remainder of the two promotion chains.

(We'd have to flesh out this definition to determine what happens to the rest of the promotion chain, but I don't think that would be too difficult).

The advantage would be that code like the example above would work in the way the user intuitively expects. The disadvantage would be the same disadvantage that comes with any change to type inference: risk of subtle and annoying changes to the behaviour of existing code.

CC @dart-lang/language-team

lrhn commented 1 year ago

This suggests that the types of interest at a join should be the union of types of interests from the join-paths, which is not unreasonable — someone showed an interest in that type somewhere in the current function.

That makes it valid to promote to a promoted type from either branch, because that type is still a type of interest afterwards.

It should be possible to define a reasonable algorithm for joining promotion two chains in a sound and useful way.

At least in most practical cases. There'll probably still be some pathological cases where we end up at the declared type, due to diamond shaped type hierarchies.

Lets assume we have a variable x with declared type T and two promotion chains T :> T1 :> T2 :> .. :> Tn and T :> S1 :> ... :> Sm.

We can immediately drop any type Si where there is no Tj s.t. Tj <: Si. We don't know that the actual value has type Si, because it's possible to go through the other path without checking for it. If all the remaining types form a promotion chain, we can use that, but they might not.

Imagine the class hierarchy:

and promotion chains A1 :> B1 :> B2 :> B3 :> A4 and A1 :> C1 :> C2 :> C3 :> A4. Every type has a subtype occurring in the other chain. We know that both, e.g., B2 and C2 are valid promotions for the current value, but neither is more significant than the other, so choosing one would be arbitrary. If we stop before promoting to A4, then B3 and C3 no longer have subtypes in the other chain. And again, both B2 and C2 (which are now the lowest types in the chains) are valid promotions, and neither is more significant. We can't reasonably choose either.

I can definitely come up with an algorithm which is deterministic and symmetric in the inputs, but it's going to make some choices at some point, and it's not clear that it's the only reasonable choice to make. It's just one, consistent and predictable, choice.

Intersection is unique. Any choice of picking a type from one side, but not from the other, risks feeling arbitrary.

(Another example is the two chains Object? :> Object :> num :> int and Object? :> num? :> int? :> int. I have a feeling that choosing Object? :> num? :> num :> int is the "best" choice, but I cannot explain why, and it's probably just an accident of how I draw the graph.)

eernstg commented 1 year ago

I think it would be possible to define an intersection operation which is well defined and meaningful in practice, but it seems very likely that it would need a tie-breaker mechanism.

I'm assuming that the current intersection operation proceeds as follows:

The value of p1 ^ p2 is a list that contains the set intersection of the elements of p1 and the elements of p2, ordered by subtyping.

(This rule is ambiguous in the case where we have types like List<dynamic> and List<void>, but we could then rely on the ordering in p1, or in p2, or perhaps use MORE_TOP. That kind of ambiguity will most likely be rare.)

Note that this definition works equally well if we consider p1 and p2 to be not just the list which is usually known as promoted, but instead that list followed by the declared type, promoted ++ [declared].

In order to 'keep the supertypes' we could consider the following variant of the intersection operation, written as ^^:

To compute p1 ^^ p2, consider the graph G whose nodes are all elements in p1, all elements in p2, and all types which are of interest based on those elements of p1/p2; the edges connect any two nodes n1 and n2, in that order, if and only if n1 <: n2. The result is then the longest path in this graph. If there are multiple paths with the same maximal length then p1 ^^ p2 is the path which is minimal, if any; if no path is minimal then p1 ^^ p2 is p1 ^ p2.

A minimal path is least according to the following ordering, <<:: h1::t1 <<: h2::t2 if h1 <: h2 and t1 <<: t2, and [] <<: [].

In other words, we take every type of interest into account, and we prefer the promotion chain that yields the most special type at every step, if any.

With @lrhn's multi-diamond hierarchy shown here, we would have [A4, B3, B2, B1, A1] ^^ [A4, C3, C2, C1, A1], and the result would be [A4, A1] because the above mentioned graph would have several distinct maximally long paths (containing 5 nodes), and none of them is minimal. Note that we do not include A2 and A3, which fits the usual approach whereby types don't become 'types of interest' unless there is a reason in the source code that makes them so, but it is still true that C3 <: B2, etc.

The other example was [int, num, Object, Object?] ^^ [int, int?, num?, Object?]. The nodes of the graph are {Object?, Object, num?, num, int?, int} and again we have multiple maximal paths (containing 4 nodes), and none of them is minimal, so the result is [int, Object?].

However, if we consider [int, num, Object?] ^^ [int, num?, Object?] then we get [int, num, num?, Object?], and surely there will be many other situations where ^^ doesn't collapse into ^.

So if we do want to keep those supertypes in a promotion join then I think it will be possible.

stereotype441 commented 1 year ago

@eernstg said:

I think it would be possible to define an intersection operation which is well defined and meaningful in practice, but it seems very likely that it would need a tie-breaker mechanism.

I'm assuming that the current intersection operation proceeds as follows:

The value of p1 ^ p2 is a list that contains the set intersection of the elements of p1 and the elements of p2, ordered by subtyping.

(This rule is ambiguous in the case where we have types like List<dynamic> and List<void>, but we could then rely on the ordering in p1, or in p2, or perhaps use MORE_TOP. That kind of ambiguity will most likely be rare.)

Interesting! My intention was that this kind of ambiguity would never arise, because of another invariant of flow analysis: every entry in a promotion chain should always be a strict subtype of the previous one (where strictness is defined in terms of mutual subtyping, so that no entry in the chain is a supertype of the previous one). This would avoid the ambiguity by making it impossible for both List<dynamic> and List<void> to appear simultaneously in any promotion chain (and hence, impossible for them both to appear simultaneously in the intersection of any two promotion chains).

But looking at the code, it looks like I failed to follow through on that design intention, because what I wound up doing was to define strictness in terms of type equality rather than mutual subtyping. Which means that you can get arbitrarily long promotion chains with nonsense like this:

if (x is! List<dynamic>) return;
if (x is! List<void>) return;
if (x is! List<dynamic>) return;
if (x is! List<void>) return;
...etc...

The current implementation doesn't do very sensible things if it tries to intersect two of these sorts of promotion chains. Nothing unsound, mind you, just not very sensible. I'll put it on my list to fix that.

Note that this definition works equally well if we consider p1 and p2 to be not just the list which is usually known as promoted, but instead that list followed by the declared type, promoted ++ [declared].

In order to 'keep the supertypes' we could consider the following variant of the intersection operation, written as ^^:

To compute p1 ^^ p2, consider the graph G whose nodes are all elements in p1, all elements in p2, and all types which are of interest based on those elements of p1/p2; the edges connect any two nodes n1 and n2, in that order, if and only if n1 <: n2. The result is then the longest path in this graph.

I think this is missing an important step: we need to first discard any elements from p1 that are not supertypes of the first element of p2, and discard any elements from p2 that are not supertypes of the first element of p1. Otherwise, joining [int, num, Object] with [num, Object] will result in [int, num, Object], which is clearly not what we want.

If there are multiple paths with the same maximal length then p1 ^^ p2 is the path which is minimal, if any; if no path is minimal then p1 ^^ p2 is p1 ^ p2.

A minimal path is least according to the following ordering, <<:: h1::t1 <<: h2::t2 if h1 <: h2 and t1 <<: t2, and [] <<: [].

In other words, we take every type of interest into account, and we prefer the promotion chain that yields the most special type at every step, if any.

With @lrhn's multi-diamond hierarchy shown here, we would have [A4, B3, B2, B1, A1] ^^ [A4, C3, C2, C1, A1], and the result would be [A4, A1] because the above mentioned graph would have several distinct maximally long paths (containing 5 nodes), and none of them is minimal. Note that we do not include A2 and A3, which fits the usual approach whereby types don't become 'types of interest' unless there is a reason in the source code that makes them so, but it is still true that C3 <: B2, etc.

The other example was [int, num, Object, Object?] ^^ [int, int?, num?, Object?]. The nodes of the graph are {Object?, Object, num?, num, int?, int} and again we have multiple maximal paths (containing 4 nodes), and none of them is minimal, so the result is [int, Object?].

However, if we consider [int, num, Object?] ^^ [int, num?, Object?] then we get [int, num, num?, Object?], and surely there will be many other situations where ^^ doesn't collapse into ^.

So if we do want to keep those supertypes in a promotion join then I think it will be possible.

Sounds reasonable to me.

lrhn commented 1 year ago

We have both "types of interest" and "promotion chains".

As I remember it, the promotion chain is used when joining control flows, to ensure that we don't lose types that both branches agree on, even if they both further promoted to incomparable types. What we had before was only remembering the most recent promoted type, and then we lost information that both branches agreed to when joining after a split.

And intersections of totally ordered chains with the same root are guaranteed to give a totally ordered chain with the same root, so it's safe and sound.

What's suggested here is that we keep promotion information from one branch that the other branch has tested for, as long as it's not contradicted by the other branch.

The problem is that this includes all types which are supertypes of any type in the other chain, so we first remove the tails that are not compatible with the other branch (by necessity), and then we are left with two chains that both safe and sound to use after the join, but their union is not necessarily a chain.

And our merge should preferably be associative, so this generalizes to merging N individual unrelated chains. If we merge D :> P1 :> P2 and D :> P3 :> P4, should it give the same result as merging D :> P1, D :> P2, D :> P3 and D :> P4? Probably, because merging D :> P1 and , D :> P2 should give D:> P1 :> P2.

Which means that our merge should just work on arbitrary sets of subtypes of the declared type, always including the declared type itself.

That sounds ... tricky.

So, take a number of promotion chains. Drop all elements that do not have a subtype in every other chain. Then "merge" all the remaining types.

My first guess would be to start by at least taking the types that are in the intersection, and then adding some more types, but that's not necessarily associative.

Can we drop the "chain" requirement entirely?

Say we define a promotion set as a set of types with a single minimal and a single maximal element, where the maximal element is the declared type of the variable, and the minimal element is a subtype of that which is the currently promoted type of the variable, and all other types in the set are types of interest which are properly between those two.

We then have to define a reasonable merge operation on such sets. Which brings us back to merging arbitrary sets, the minimal element doesn't help us here, since it's likely to be the first element eliminated in the merge.

At least, the merge operation would then just have to find some least element, and not also worry about creating a chain from the rest.

So, given a set of subtypes of a type D, we try to find a unique subset with single minimal element. If the set has a single minimal element already, we're done. If not, look at all the minimal elements (which do not have smaller elements in the set). ... erase them all? Erase some? Everything feels arbitrary, and also probably not associative. (MERGE(MERGE(D:>P1, D:>P2), D:>P3) where P2:>P3 too, will drop P1 and P2 from the first merge, then give D:>P3 as result, but (MERGE(D:>P1, MERGE(D:>P2, D:>P3)) would create D:>{P2}:>P3 from the last merge and D:>{P1, P2}:>P3 from the first.).

I don't think it's viable to do a merge on arbitrary types in a way which is predictable commutative and associative, and I do think being that is important.

eernstg commented 1 year ago

@stereotype441 wrote:

every entry in a promotion chain should always be a strict subtype of the previous one

Ah, I didn't know that.

But ... you can get ... nonsense like [an example involving List<dynamic> and List<void>]

Sounds like it should not be a breaking change to fix that, and it's really not useful to support those non-strict promotion chains.

I think this is missing an important step ... [we must filter out types that aren't supertypes of most recent promotion in every branch]

Ah, of course, the promotion would not be sound if we forget about that! My bad.

@lrhn wrote:

And our merge should preferably be associative, so this generalizes to merging N individual unrelated chains. If we merge D :> P1 :> P2 and D :> P3 :> P4, should it give the same result as merging D :> P1, D :> P2, D :> P3 and D :> P4? Probably, because merging D :> P1 and , D :> P2 should give D:> P1 :> P2.

I agree that the operation should be associative, and also commutative, because there's nothing sequential about, e.g., the different paths of control flow through the bodies of a switch statement (only the patterns and guards are known to be executed in a linear manner).

However, we can't merge D :> P1 and D :> P2 and get D:> P1 :> P2: This would be unsound, because all we know is that the variable (say, x) refers to an object of type P1 when we reach the end of the first branch which is being joined, and x is P2 at the end of the other branch. But in that case we only know that x is P after the join, because we just know that we've taken one or the other branch.

@stereotype441's filtering step would handle this: At first, we filter [D, P1] based on P2 and get [D, P1] (because we're assuming P2 <: P1), then we filter [D, P2] based on P1 and get [D], and then we compute the longest path in the graph with nodes {D, P1} and get [D, P1].

I don't think it's viable to do a merge on arbitrary types in a way which is predictable commutative and associative

I think it would actually work: The operation that I proposed, corrected to have the filtering step that @stereotype441 mentioned and generalized to have k operands, goes as follows (where pj includes the declared type as the last element for all j in 1..k):

To compute p1 ^^ p2 ... pk, let Tj be the head of pj, for j in 1..k. Then consider the graph G whose nodes are all elements in pj that are supertypes of Ti for every i != j; the edges connect any two nodes n1 and n2, in that order, if and only if n1 <: n2. The result is then the longest path in this graph. If there are multiple paths with the same maximal length then the result is the path which is minimal, if any; if no path is minimal then the result is p1 ^ p2 .. pk.

lrhn commented 1 year ago

However, we can't merge D :> P1 and D :> P2 and get D:> P1 :> P2: This would be unsound,

ACK, my bad. We'd get D :> P1 out of it, because that's what we do know.

Still, assume D :> P1 :> P2 and P1 :> P3, but no relation between P2 and P3.

Then (D :> P1 ^^ D :> P2) ^^ D :> P3 → D :> P1 ^^ D :> P3 → D :> P1, but D :> P1 ^^ (D :> P2 ^^ D :> P3) → D :> P1 ^^ D → D, so still not associative.

(This is the best any sound binary join, based only on the incoming chains, can do with the given type hierarchy. We can do worse and end up with D in the first case too, and that's where we are today.)

The maximal length path among the chain elements does have the advantage that it will contain the intersection of those chains. But looking at the above, I don't believe it can be associative.

eernstg commented 1 year ago

I think the operator that accepts k operands for any k >= 0 will suffice: The basic task handled with ^^ is always joining a set of branches corresponding to a specific location in the source code. We won't ever need to find p1 ^^ (p2 ^^ p3) or (p1 ^^ p2) ^^ p3 at any specific location, we will always compute p1 ^^ p2 ^^ p3 directly.

We can encounter cases where we compute something with a structure like (p1 ^^ p2) ^^ p3 because we have nested source code structures: (p1 ^^ p2) could be the promotion chain obtained from a given switch statement S1, and that result could be used in a computation of (p1 ^^ p2) ^^ p3 because S1 occurs as the body of one case in another switch statement S2. But the parenthesis structure is not arbitrary in this case, we can see that S1 is nested inside S2, and hence it is not a problem that we choose (p1 ^^ p2) ^^ p3 rather than p1 ^^ (p2 ^^ p3).

Commutativity is a nice property—it's very good to know that there is no need to reorder the cases in a switch just because that would yield better typings after the end of the switch. The k-ary operator will obviously deliver the same result for any ordering of the operands, so that property is a given.

In the end, I suppose we could say that associativity isn't important for ^^.

lrhn commented 1 year ago

Just because the structure is statically known, it doesn't mean it's obvious to readers and authors.

Example, if/else if chains:

D v = ...;
if (v is P3 && test3(v)) {
   ....
} else if (v is P2 && test2(v)) {
   ...
} else if (v is P1 && test1(v)) { 
   ...
} else {
  return false;
}
/// is `v` `D` or `D:>P1` here?

Syntactical grouping-wise, this is if (v is P3 && ..) { ... } else { if (v is P2 && ..) { ... } else { if (v is P1 && ..) { ... } else { return false; } } }, so we'd end up doing binary merging, we'll join P2 with P1 first, then P3 with P1, and get D :> P1 (so, yey, lucky).

But I can reorder the tests and get a different result:

D v = ...;
if (v is P1 && test1(v)) {
   ....
} else if (v is P2 && test2(v)) {
   ...
} else if (v is P3 && test3(v)) { 
   ...
} else {
  return false;
}
/// is `v` `D` or `D:>P1` here?

It's not entirely the same because v is P1 will match if v is P2 or P3 too, but assume the test1(v) means that the two other cases are not unreachable. With that ordering, using binary joins, we get a v with unpromoted type D.

I don't expect users to read an if/else if chain and consider which order the distinct branches' promotions are joined in. They are clearly separate and mutually exclusive paths, which we should merge as one at the end.

Meaning: It's not obvious that syntactic structure is the best guide to when to use multi-way join. An associative binary join does not have that issue.

eernstg commented 1 year ago

Just because the structure is statically known, it doesn't mean it's obvious to readers and authors.

💯 !

However, we probably can't avoid this anomaly entirely, and I think the k-ary operator that I mentioned would be a reasonable approach.

munificent commented 1 year ago

This is way outside of my area of expertise, so maybe this isn't helpful. But are all the real-world examples where this is annoying related to nullable types? If so, maybe the easy/hacky fix is that whenever you promote from a nullable supertype to a non-nullable subtype, you implicitly split that into two steps in the promotion chain? I'd probably to the nullable->non-nullable first. So in the original example:

void f(num? x) {
  switch (x) {
    case final a when a is int:
    case final a when a is num:
      a; // (1)
  }
}

The promotion chains are then num? :> num :> int and num? :> num. And now you have the expected common type num.

lrhn commented 1 year ago

Taking two commutative steps means potentially taking the wrong one first. The promotion chain could also be num? :> int? :> int, and then there is no shared num. And if the other branch was num? :> int?, that would be the useful chain.

We'd have to choose one, and hope it's the right one.

I keep thinking that there might be some way to merge chains in a null-aware way, but I can't come up with one which breaks symmetry.