stereotype441 opened 1 year ago
CC @munificent @leafpetersen @jakemac53 @lrhn @mit-mit @natebosch @eernstg @kallentu
I believe regular variable declarations use option 1 (`?`):

```dart
main() {
  List list = [1, 2, 3];
  print(list.runtimeType);
}
```

This prints `List<dynamic>`. I think we probably want to do the same thing for patterns, but I'd like to hear from folks who know the type system better than I do.
> I believe regular variable declarations use option 1 (`?`):
>
> ```dart
> main() { List list = [1, 2, 3]; print(list.runtimeType); }
> ```
>
> This prints `List<dynamic>`. I think we probably want to do the same thing for patterns, but I'd like to hear from folks who know the type system better than I do.
If I'm not mistaken, regular variable declarations use option 2 (instantiate to bounds), which is why this prints `List<dynamic>`. It instantiates `List<E>` using the bound of `E`, which is `dynamic`; then it uses `List<dynamic>` as the context for analyzing `[1, 2, 3]`. At downward inference time, we generate the constraint `T <: dynamic`, which causes `dynamic` to be chosen for the list element type before the list elements are even visited.

If the context had been `List<?>`, then downward inference would not have generated any constraint (we don't generate constraints of the form `T <: ?`), and so the type would remain unknown until upwards inference time. At that point the presence of the `1`, `2`, and `3` would cause the constraints `int <: T` to be generated, at which point `int` would be chosen for the list element type.
You can simulate the behaviour for `List<?>` using this test program:

```dart
dynamic f<T>(List<T> x) => x;

main() {
  print(f([1, 2, 3]).runtimeType);
}
```

which prints `List<int>`.
Take:

```dart
class Box<T> {
  T value;
  Box(this.value);
}
```

and

```dart
var Box(value: int x) = Box(1);
```
The type schema for the LHS will be either `Box<?>` (not inferred yet), `Box<dynamic>` (instantiated to bound), or `Box<int>` (inferred from the type schema for `value`, which constrains `T`).

Do we get the third option? If not, can we?

If we don't infer `Box<int>`, I'd prefer to use `Box<?>` as the context type schema of the RHS.
If we use `Box<dynamic>` as the context type, we will make the RHS be `Box<dynamic>(1)`. If we use `Box<?>`, it becomes `Box<int>(1)`, which is actually matched by the LHS.

Then we come back to the LHS for type checking. If we get `Box<int>` as the matched value type, it will work; otherwise it won't.

So, using `dynamic` makes that line fail to compile, even though it looks very reasonable. Using `?` makes it work. Inferring `int` works too.

Inferring `int` will be much better. Take:

```dart
var Box(: double value) = Box(0);
```

If we get `Box<double>` as the RHS context type schema, then this works.
So, can we define the schema extraction for an object pattern as:

- Let C be the declaration that the object name resolves to.
- If C is generic, and the object pattern does not provide all type arguments (as actual type arguments, or by unwinding type aliases),
  - Find the type schemas for each field pattern of the object pattern.
  - Match them up against the declared types of the getters invoked by the field patterns.
  - Feel free to ignore those whose type does not involve a type variable.
  - Try to "solve for the type variables."
  - If successful, use the inferred type as type argument of C for that type variable. If not, use `?`.
- Then the type schema for the object pattern is the thus instantiated C.

Very handwavy. Does it seem possible?
Thanks @lrhn for your very astute and precise inside-the-`Box` thinking. Those examples are really compelling.
> So, can we define the schema extraction for an object pattern as:
>
> - Let C be the declaration that the object name resolves to.
> - If C is generic, and the object pattern does not provide all type arguments (as actual type arguments, or by unwinding type aliases),
>   - Find the type schemas for each field pattern of the object pattern.
>   - Match them up against the declared types of the getters invoked by the field patterns.
>   - Feel free to ignore those whose type does not involve a type variable.
>   - Try to "solve for the type variables."
>   - If successful, use the inferred type as type argument of C for that type variable. If not, use `?`.
> - Then the type schema for the object pattern is the thus instantiated C.
>
> Very handwavy. Does it seem possible?
Yeah, I think it's possible. In fact, I think it's not too hard to get rid of the handwaving. The core operation we use to create type inference constraints is "try to match `U` as a subtype of `V`", where either `U` or `V` can refer to the type variables being solved for (but not both). (The actual spec is a little wordier than this--see Subtype constraint generation for details.) So the logic would be:

- Form the type of the class, with its type parameters as type arguments, and call it `X`.
- If `X` contains type variables, and the object pattern does not supply type arguments, then:
  - Start with an empty set of type constraints `C`.
  - For each field pattern `p`, with name `n` and subpattern `f`:
    - Compute the type schema of `f` and denote it `S`.
    - Let `G` be the type of the getter on `X` (expressed in terms of the type variables).
    - Try to match `G` as a subtype of `S`, accumulating the resulting type constraints into `C`.
  - Instantiate `X` using the constraint solution for `C`.

Note that the constraint solution for `C` doesn't pin down all the type variables; just the ones for which constraints were generated.
Which is the behaviour we want here. It also does all the right things to account for bounds.
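The constraint-matching step above is essentially what ordinary generic-function inference already does, so you can approximate it today with a helper function (my own sketch; `inferBox` is a made-up helper, not spec machinery):

```dart
class Box<T> {
  T value;
  Box(this.value);
}

// Matching the argument's type against the parameter type T is the
// "match G as a subtype of S" step: it generates the constraint that
// pins down T.
Box<T> inferBox<T>(T value) => Box(value);

main() {
  print(inferBox(1).runtimeType);   // expected: Box<int>
  print(inferBox('a').runtimeType); // expected: Box<String>
}
```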
The logic in the analyzer and CFE for doing all this is rather crufty, so it won't necessarily be easy. (But I've made several attempts to tidy it up over the years, and I'm not afraid to try again!)
I do wonder a little bit if the result would be useful enough to justify the implementation effort, but on the other hand it's so much cheaper to get this sort of thing right the first time rather than fix it later (because you don't have to fix code that's out there in the wild relying on the old inference behavior).
> If I'm not mistaken, regular variable declarations use option 2 (instantiate to bounds), which is why this prints `List<dynamic>`.

Ha, I am clearly out of my element here.
> The type schema for the LHS will be either `Box<?>` (not inferred yet), `Box<dynamic>` (instantiated to bound), or `Box<int>` (inferred from the type schema for `value`, which constrains `T`).
>
> Do we get the third option? If not, can we?
This is an excellent point. I think we should get the third option if possible. In particular, because that is what we do for list and map patterns that don't have explicit type arguments: we infer them from the subpatterns. So there's already an ad hoc bottom-up type schema inference going on for two of the destructuring patterns, and it seems like we should really generalize that to object patterns too. It would be unfortunate if these caused different types of lists to be instantiated:
```dart
var [int x, ...] = [1];
var List(first: int x) = [1];
```
It looks to me essentially like the bottom-up inference we do for generic method invocations based on the argument expressions.
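For what it's worth, the list-pattern half of this can be probed directly, since the rest pattern captures a list with the inferred element type (a small probe of my own, assuming an SDK with patterns):

```dart
main() {
  // If the list literal were inferred as List<dynamic>, the rest
  // pattern would capture a List<dynamic>; with schema-based
  // inference from the subpatterns it captures a List<int>.
  var [int x, ...rest] = [1, 2, 3];
  print(rest.runtimeType); // expected: List<int>
}
```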
> This is an excellent point. I think we should get the third option if possible. In particular, because that is what we do for list and map patterns that don't have explicit type arguments: we infer them from the subpatterns. So there's already an ad hoc bottom-up type schema inference going on for two of the destructuring patterns, and it seems like we should really generalize that to object patterns too. It would be unfortunate if these caused different types of lists to be instantiated:
>
> ```dart
> var [int x, ...] = [1];
> var List(first: int x) = [1];
> ```

Oh yes, good point, I would really like for type inference to treat those two in the same way as well.
It sounds like we have a fairly good consensus, but I'm interested in hearing from @leafpetersen about whether he thinks this amazing awesome functionality is going to be worth the extra implementation effort. I think it's worth it, but I know that I'm prone to biting off more than I can chew, and Leaf is the person who usually talks sense into me when I'm about to take on too much work. Assuming he doesn't try to talk me out of it, I'll plan to write up a PR adding this to the patterns spec, and add some language tests.
Sorry, I dropped the ball here. I'm fine with this, I think it makes sense. If @stereotype441 is still interested in pursuing it, I'm happy to review a PR.
I think it would be pragmatically very desirable to compute a context type schema for a pattern which is as informative as it can be, based on subpatterns. That is, it would be very nice if we could do as @stereotype441 specifies here.
However, there's a discrepancy with the treatment of "half" types in regular (non-pattern) declarations, as mentioned by @munificent here: In that case we will use instantiation to bound to obtain the missing type arguments.
A third situation where we have considered using inference rather than instantiation to bound is type tests: `x is List`.
I think it would be great if we could make these situations converge as much as possible. Here are a few potential scenarios:

1. We use inference in all of these situations (so `List xs = [1];` will turn `List` into `List<int>` rather than `List<dynamic>`).
2. We use inference with patterns and keep instantiation to bound elsewhere, together with a hint about the discrepancy (for `List xs = [1];` as well as `x is List`).
3. We use instantiation to bound in all of these situations.
4. We use inference with patterns and instantiation to bound elsewhere, with no hint.

I certainly think that (1) and (2) are much better than (3). Developers have been asking for something like (1) with regular declarations, and they've been surprised that we didn't already do that. But (4) is actually even worse than (3), because developers won't get a suitable hint about the discrepancy, and it really does matter whether your list is a `List<int>` or a `List<dynamic>`.
It also seems useful to be able to say, e.g., that the desired declared type is "some kind of `Iterable`" using `Iterable xs = [1];`: this would make the declared type `Iterable<int>`, which may be more appropriate than `List<int>` in the given situation, and it can't be expressed unless the type `Iterable<int>` is specified explicitly and in full (that is, including `<int>`).
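For comparison, here is what a raw `Iterable` declaration does today under instantiation to bound (my own probe, reflecting current Dart behavior as I understand it):

```dart
main() {
  // Instantiation to bound completes the declared type Iterable to
  // Iterable<dynamic>, so the list literal is inferred with element
  // type dynamic rather than int.
  Iterable xs = [1];
  print(xs.runtimeType); // expected: List<dynamic>
}
```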
The patterns proposal currently says this, in the "Pattern context type schema" section:
If the type being referred to is generic, but the pattern doesn't specify type arguments, what should the context be? I can think of two possibilities that might make sense:

1. Fill in all the type arguments with `?`.
2. Instantiate to bounds.
Here's an example where it makes a difference:
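The example itself didn't come through here; a hypothetical stand-in consistent with the `C([1, 2, 3])` discussed in the two options (my own class `C`, not the proposal's original code) would be:

```dart
class C<T> {
  final List<T> values;
  C(this.values);
}

main() {
  // Option 1 gives the RHS the context C<?>, so the list is inferred
  // as List<int>; option 2 gives it C<dynamic>, so List<dynamic>.
  var C(values: v) = C([1, 2, 3]);
  print(v.runtimeType);
}
```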
- If we do option 1 (fill in all the type arguments with `?`), the context `C<?>` will be used to infer `C([1, 2, 3])`; consequently the context `List<?>` will be used to infer `[1, 2, 3]`, so the list that's created will be a `List<int>`.
- If we do option 2 (instantiate to bounds), the context `C<dynamic>` will be used to infer `C([1, 2, 3])`; consequently the context `List<dynamic>` will be used to infer `[1, 2, 3]`, so the list that's created will be a `List<dynamic>`.

My gut says we should go with option 1, but I'm not certain.