Closed sstucki closed 1 year ago
Concretely, here we want all the lambdas to have the (wider) kind (Y <: String) -> * which is the declared kind of L in the parent trait B:
Can you extend this to the case where the upper and lower bounds of L have different kinds? E.g.:
trait A {
type L >: [Y <: Any] => Any <: [Y <: String] => Any
}
Yes, you can. But the two bounds need to have at least the same simple kind, i.e. the same number of arguments. It doesn't work if one bound is, say, Int
and the other is a type lambda.
In your example, the common kind could be
[Y <: Any] => Any :: (Y <: String) -> *
[Y <: String] => Any :: (Y <: String) -> *
But that's not the only option, e.g. (Y <: Nothing) -> * would work as well.
Whether or not there's always a common kind depends on how liberal you allow bounds to be. For example, the type lambdas [X >: Nothing <: Nothing] => Any and [X >: Any <: Any] => Any only have a common kind if you allow absurd/empty bounds: (X >: Any <: Nothing) -> *.
Does that answer your question?
Yes, makes sense.
Concretely, here we want all the lambdas to have the (wider) kind (Y <: String) -> *, which is the declared kind of L in the parent trait B:
But in A, L takes Y <: Int. Is that a typo? What should A look like? I have further questions related to how A fits in the rest... Should I just remove A everywhere to read your post?
Then we need to check that the definitions of the lambdas agree in this kind
What if the body of one of the lambdas is not well-formed under the widened kind? E.g. given:
class Foo[X >: String]
I may want to check:
[X >: Any] => Nothing <: [X >: String] => Foo[X]
Which according to the second typing rule you give means checking:
G, X >: Any |- Nothing <: Foo[X]
Which doesn't type-check. Given that the compiler only fully checks bounds of applications after typechecking is done (I think mostly to avoid completion cycles), that seems problematic.
Wait no, my example is wrong and does type-check, nevermind, it's too late for me to think about this properly :). But maybe you can tell me if the bodies of the lambdas are always guaranteed to typecheck with the widened kind.
Here's a better example of what I have in mind, given:
trait A {
type L >: [X] => Nothing <: [X] => Any
def foo: L[Int]
}
I think your rules allow me to write:
trait B extends A {
type L = [X <: String] => Any
}
But then the type of foo in B is L[Int], which reduces to ([X <: String] => Any)[Int], which doesn't typecheck; that seems bad to me.
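A value-level analogy may help see why the application is ill-formed (this is just an illustration, not the actual kind-checking mechanism):

```scala
// [X <: String] => Any applied to Int is like applying a String-only
// function to an Int: the argument falls outside the declared domain.
val l: String => Int = _.length   // stands in for [X <: String] => Any
val ok = l("abc")                 // fine: "abc" is within the domain
// l(42)                          // would not compile: Int is not a String
```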
@smarter
What if the body of one of the lambda is not well-formed under the widened kind ?
That can't happen: a wider lambda-kind gives narrower bounds, which makes more things typecheck (by Liskov substitution principle). So your example would be a problem, but the rules should allow contravariant bound refinement (just like for methods).
So, IIUC, this issue is about allowing the "reverse" of your example, as follows:
trait A {
type L >: [X <: String] => Nothing <: [X <: String] => Any
def foo: L[String]
}
trait B extends A {
type L = [X] => Any // this has kind `(Y <: Any) -> *`, which can be widened to the expected `(Y <: String) -> *`, so this is a valid definition
}
In Sandro's example, kind (Y <: String) -> * is wider than (Y <: Any) -> *, because Any is wider than String and function kinds are contravariant (just like function types). And any lambda body that typechecks under Y <: Any also typechecks under Y <: String (that's part of the LSP; in particular, that should be the narrowing lemma).
To be clear, ∀ is contravariant and λ isn't: the function type ∀(x: S). T is contravariant in S, and the kind-level arrow (x <: T) -> K is as well.
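The variance claim can be checked at the value level, where plain function types exhibit the same contravariance (a sketch; the functions stand in for lambda kinds):

```scala
// A function on a wider domain stands in where a narrower one is expected,
// much like a lambda of kind (Y <: Any) -> * used at kind (Y <: String) -> *.
val onAny: Any => Int = _ => 0
val onString: String => Int = onAny   // OK: Any => Int <: String => Int
// val bad: Any => Int = (s: String) => s.length  // would NOT compile
```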
I don't get it. The original post says:
G, X >: (L1 | L2) <: (U1 & U2) |- T1 <: T2
-----------------------------------------------------
G |- [X >: L1 <: U1] => T1 <: [X >: L2 <: U2] => T2
From this I get [X] => Nothing <: [X <: String] => Any and [X <: String] => Any <: [X] => Any, so https://github.com/lampepfl/dotty/issues/6320#issuecomment-483880214 should compile. What's the step I'm missing exactly?
But in A, L takes Y <: Int. Is that a typo? What should A look like? I have further questions related to how A fits in the rest... Should I just remove A everywhere to read your post?
No, it's not a typo. But I had the same reaction as @Blaisorblade when copy-pasting the code from the old issue. What follows really only addresses the error reported by the compiler about incompatibility with B.
At first I thought, once the error w.r.t. B is resolved, there should be a similar check for A, which, following the same reasoning, should pass as well. But the common kind for that check would be different. So I'm actually not sure whether the whole example should pass or not. I guess it depends on how mixins work. For example, is it OK to find a common kind for all three declarations of L that isn't the declared kind of L in either A or B? The kind (Y <: Int & String) -> * would work, but it's an upper bound, rather than a lower bound, of the kinds of L in A and B, respectively. That's probably not safe. We probably have to "fix" the kind of L in both A and B before performing the check. If we fix it to (Y <: Int & String) -> * in both traits, then everything works out, but if we fix different, incompatible kinds in A and B, the code should not pass the typer. Does that make sense?
In fact, I think this is related to @smarter's comment:
From this I get [X] => Nothing <: [X <: String] => Any and [X <: String] => Any <: [X] => Any, so #6320 (comment) should compile. What's the step I'm missing exactly?
This is very interesting! You may just have found a counterexample to the second approach. You're right, the bounds check out, but what is happening is that they only do so in the wider kind that L has in B, rather than the kind L has in A. It all makes sense once we make kinds explicit (essentially following the first approach).
In A, the type L has kind (X >: Nothing <: Any) -> * (or * -> * for short). In principle, it could also have kind (X >: Nothing <: String) -> *, but then the application L[Int] would be ill-kinded, so we really want it to have the narrower kind * -> * for the entire trait definition to work.
Now in B, the type L suddenly acquires a wider kind (X >: Nothing <: String) -> *. It cannot have kind * -> * because that doesn't fit the upper bound String on X. But then there is a problem: a type member does not get to widen its kind in a refinement!
So how come solution 2 did not show this? Well, because it automatically finds the narrowest kind that fits both lambdas being compared, rather than taking an expected kind as an input. Here, this kind is (X >: Nothing <: String) -> *. The comparison [X <: String] => Any <: [X] => Any only makes sense in (X >: Nothing <: String) -> *, which means the bounds only check out if we assume L already had kind (X >: Nothing <: String) -> * in A, which contradicts well-kindedness of L[Int].
I have no clue how this issue with solution 2 could be fixed without passing an expected kind. Any ideas?
I guess it depends on how mixins work.
After typechecking, we'll check that the type definition is within bounds of the definitions in each mixin, one by one.
I have no clue how this issue with solution 2 could be fixed without passing an expected kind. Any ideas?
Nope, I was hoping you'd have some magic solution :)
Going back to solution 1, I'm not sure how being kind-directed would work in practice. How do you decide which kind to use in the first place?
I guess it depends on how mixins work.
After typechecking, we'll check that the type definition is within bounds of the definitions in each mixin, one by one.
OK, so there is no linearization order or something? If these checks should be driven by some expected kind from the parent traits, then a common one would have to be found first. For example, one could pick the first one (here A?) and check the second one against it (here, check that the kind of L in B conforms to the kind of L in A). Alternatively, try to find the GLB of the kinds of L from both A and B and use that as an upper bound to check the kind of L in C.
What would you do if L was an abstract method rather than an abstract type constructor? I'm sure there's some way to determine the expected type of a method in a mixin composition, no?
I have no clue how this issue with solution 2 could be fixed without passing an expected kind. Any ideas?
Nope, I was hoping you'd have some magic solution :)
One possibility I see is to have a flag in the subtype checker to indicate whether you want to check from above or below. When checking A <: B, the "above" version would assume the common kind to be the kind of the RHS B, and the "below" version would use the LHS A to drive checking. This would result in two versions of the rule from solution 2 above: one using the bounds of the LHS when extending the context, and one using the bounds of the RHS. But I'm not sure this could be made to work consistently for the entire subtype checker. E.g. any "rules" that make use of some form of transitivity would result in some sort of narrowing of the implicit expected kind. So this probably wouldn't work.
Going back to solution 1, I'm not sure how being kind-directed would work in practice. How do you decide which kind to use in the first place?
Well, it depends on where you are performing the subtype check, and it may not be obvious in all situations. But a good rule of thumb would be that you do something similar to what you do when you're using expected types (e.g. for checking definitions). For example, in the above case, you somehow manage to figure out the right type to use when you compare refinements of method signatures. If you have traits
trait A { def foo(x: Int): Any }
trait B { def foo(x: String): Any }
trait C extends A with B { def foo(x: Nothing): Any }
this somehow works out, so you have to find a common "expected type" for foo to check against in C, no?
this somehow works out, (...)
Whoops, this does not do what I thought it would. I thought that foo in C would be considered a refinement of the abstract methods foo in A and B, but apparently they are just overloaded. So type members are treated differently from methods after all, it seems.
OK, so there is no linearization order or something?
In the body of trait C extends A with B, we see the definition of L coming from B due to linearization, but in the end, the definition of L in C must override the one in A and the one in B (after all, C <: A & B); this is checked in a phase after typechecking.
But a good rule of thumb would be that you do something similar to what you do when you're using expected types
I hope we can formalize this somehow :).
Whoops, this does not do what I thought it would. I thought that foo in C would be considered a refinement of the abstract methods foo in A and B but apparently they are just overloaded. So type members are treated differently from methods after all, it seems.
In an override, you can refine the result type of a method covariantly, but you can't refine the parameter types contravariantly; that results in overloads indeed.
I hope we can formalize this somehow :).
So do I :-)
I have an analogy from mathematics that might help clarify the situation a bit and that also suggests a possible way forward. (It's based on a similar analogy I mention in #6369.)
Imagine we want to prove some abstract theorem about real numbers and real-valued functions. The theorem doesn't hold for all reals though, only for those that satisfy a set of inequalities. So in our theorem we might say things like "assume a positive real number x ..." which is just another way of saying "assume a real number x such that 0 ≤ x". When we define a trait with some bounded abstract type members, we are doing the same thing, but for types. The declaration
type X >: Int
just says "assume a type X such that Int <: X" – so far so good.
Going back to our math analogy, assume we want to state a theorem about some abstract real-valued function f: ℝ → ℝ which is bounded by some other functions, e.g. g(x) = 0 and h(x) = 2 x for every x ∈ ℝ. This is analogous to declaring an abstract type operator with a lower and upper bound, e.g.
type F[X] >: G[X] <: H[X]
So what about bounds on the arguments? Assume our theorem is about an abstract function f: ℝ → ℝ that is lower-bounded by g(x) = log(x). We know that the function g is not defined for all reals, only for the strictly positive ones. In other words, the argument of g is itself lower-bounded! If we were to use some sort of pseudo-Scala syntax, we could write its definition as g(x > 0) = log(x). With that in mind, what does it mean when we ask that f be lower-bounded by g? It means that f(x) ≥ g(x) for all positive reals. When we compare f and g it simply does not make sense to ask that f(-1) ≥ g(-1) because g is ill-defined at -1, even when f itself is defined for all the reals. So by saying that f ≥ g for this particular choice of g, we're implicitly constraining the domain where we can say meaningful things about f. Similarly, when we write a declaration like
type F >: [X <: String] => ...
we restrict the domain where we can say meaningful things about F to subtypes of String.
With that in mind, let's revisit your earlier example:
trait A {
type L >: [X] => Nothing <: [X] => Any
def foo: L[Int]
}
trait B extends A {
type L = [X <: String] => Any
}
This is a bit like stating a theorem A with a premise like "assume a function ℓ: ℝ → ℝ such that -∞ ≤ ℓ(x) ≤ ∞, and let foo = ℓ(-1), ...", which is fine because, by assumption, ℓ is defined on all the reals. Then you try to apply your theorem with ℓ(x) = log(x). At first sight, this seems OK because -∞ ≤ log(x) ≤ ∞, except that this is only true for positive reals. And the definition of foo does not make sense because log(-1) is ill-defined. The problem is not the bounds check, it's that log doesn't have the expected type ℝ → ℝ, it has type { x ∈ ℝ | x > 0 } → ℝ. Similarly, the definition of L in B fails not because it doesn't conform to the bounds in A, but because the kind doesn't match. When we test whether a definition or refinement of an abstract type member conforms to its declaration, we need to take the kind of the original declaration into account, not just the bounds.
So how to solve this? I see two obvious approaches:
1. make the check kind-directed, passing in an expected kind (here, the declared kind of L in A), or
2. always infer the least kind of a definition (here (X: Nothing..Any) -> Nothing..Any), and enforce that kind, irrespective of how it came about, in refinements.
The problem with (1) is that Scala doesn't have a kind language, the problem with (2) is that it will likely break a bunch of code where a type definition uses a bound, but its declaration in a super-class/-trait does not (like in the above example).
This is a really useful analogy, thanks!
the problem with (2) is that it will likely break a bunch of code where a type definition uses a bound, but its declaration in a super-class/-trait does not (like in the above example).
Right, so there's a trade-off here. If we were to go with (2), would the original example involving mixins from https://github.com/lampepfl/dotty/issues/6320#issue-433841780 still work?
If we were to go with (2), would the original example involving mixins from #6320 (comment) still work?
Good question. I think it should. By analogy, if Lemma A holds for functions on positive reals, and Lemma B holds for functions on odd numbers, then the conjunction (i.e. the intersection A&B) should hold for any function defined on positive reals and odd numbers. In particular, it should hold for functions on arbitrary reals as well.
What's less obvious to me is whether the kind of L in C should be (X <: Int | String) -> (Int | String)..Any, as dictated by the mixin, or the more precise kind (X <: Any) -> Any..Any, as suggested by the annotation in C.
@sstucki On your last Q, I'd still go with (2). Except that, it seems to me, inferring the least interval kind means we can do away with expected kinds, and we should just check that the info of a type declaration, that is, its interval kind, is a subkind of the intersection of the parent declarations. That should reject @smarter's counterexample, accept the example in the OP, and follow Gist HK-DOT.
Notation: following that Gist, I'll sometimes write interval kinds L..U instead of pairs of lower and upper bounds >: L <: U, even though interval kinds can be translated into Scala using bounds.
Translating the surface syntax from HK-DOT to Scala, we get the following variant of the OP (1), which combines HK-DOT (K-..-<::-..) and (K-→-<::-→):
G, X >: L2 <: U2 |- T1 <: T2
G |- [X >: L2 <: U2] <:: [X >: L1 <: U1]
----------------------------------------------------------------------------- (K-..→-<::-..→)
G |- [X >: L1 <: U1] => T1 <: [X >: L2 <: U2] => T2
where G |- [X >: L2 <: U2] <:: [X >: L1 <: U1] is alternative syntax for G ⊢ L₂..U₂ <:: L₁..U₁ and can be replaced by G |- U2 <: U1 and G |- L1 <: L2.
Do we have an example where this rule differs from what you propose?
I don't have a mathematical metaphor, and I can't summarize yours, but here's the naive set-theoretic semantics I have in mind: for each arity, kinds map to set of "types", and subkinding to a subset relationship.
What's less obvious to me is whether the kind of L in C should be (X <: Int | String) -> (Int | String)..Any, as dictated by the mixin, or the more precise kind (X <: Any) -> Any..Any, as suggested by the annotation in C.
I think it should be the more precise kind (X <: Any) -> Any..Any. Kinds of type members are how type definitions are exposed (in ML with singleton kinds, and in Gist HK-DOT), and defining a type refines its kind. See also below for how this example then works out.
But when checking whether two lambdas are subtypes, their bounds need not be compared, as long as both lambdas have a common kind.
[...]
Similarly, the definition of L in B fails not because it doesn't conform to the bounds in A, but because the kind doesn't match. When we test whether a definition or refinement of an abstract type member conforms to its declaration, we need to take the kind of the original declaration into account, not just the bounds.
Honestly, it seems to me that if you infer the least kind and compare that, it's enough to compare the bounds directly. Except that "compare the bounds" also includes the comparison of type argument bounds, that you move to the comparison of kinds.
So, in the example in the OP, the inherited "kind"/bound pair/info for L should be the intersection [Y <: String|Int] => (String|Int)..Any of the parent kinds [Y <: String] => String..Any and [Y <: Int] => Int..Any. The kind of C.L is [Y <: Any] => Any..Any, which is a subkind, hence the example should be accepted. And that should be the "exposed" kind of C.L.
@smarter's counterexample is rejected, because [X <: String] => Any..Any <:: [X] => Nothing..Any would require the false premise Any <: String.
EDITs: formatting, fixing typos.
@sstucki Comparing the rule I just proposed (K-..→-<::-..→) with the 1st in the OP, a derivation of (K-..→-<::-..→) can be transformed into
G, X >: L2 <: U2 |- T1 <: T2
G |- [X >: L1 <: U1] => T1 :: (X >: L2 <: U2) -> Nothing[T2]..Top[T2]
G |- [X >: L2 <: U2] => T2 :: (X >: L2 <: U2) -> Nothing[T2]..Top[T2]
-----------------------------------------------------------------------------
G |- [X >: L1 <: U1] => T1 <: [X >: L2 <: U2] => T2 :: (X >: L2 <: U2) -> Nothing[T2]..Top[T2]
under some (I think reasonable) assumptions.
@Blaisorblade, I'll start by summarizing your proposal as I understand it – correct me if I missed anything:
1. To check a type declaration in a class/trait C against the one in its super-class/-trait B, just compare the inferred kinds of the two declarations: we should have KC <:: KB.
2. Kinds K are really just type intervals K = L..U, so we may as well simplify (1) and just check that LB <: LC and UC <: UB.
3. To do (2) we need to compare type lambdas, and we should do so with a rule that treats bound annotations contravariantly, in accordance with the subkinding rule for arrow kinds from the HKDOT gist.
TLDR: I agree with (1) but not with (2) and (3). Explanations and counter examples follow.
There's not much to be said about (1), that is essentially what I advocate. It is consistent with the subset of Scala that I have studied and proven safe.
The problem with (2) is the supposed equivalence between kinds and type intervals. It's very natural and very appealing: it suggests that we don't need to introduce a separate kind language because we can just keep track of lower and upper bounds, which are types. It also seems consistent with the trick Pierce uses in TAPL, where kinds are encoded through their extremal types (kind K is represented by Top[K]).
The problem is that this only works if you keep track of both ends of the interval, the lower and the upper bound, at all times, and enforce their consistency. If you don't do that, the bound annotations in lambdas may get out of sync, and then you suddenly don't know which one you should enforce. In particular, when you compare two function intervals (intervals where both bounds are functions), there are a total of four pairs of bound annotations involved, and picking the right one will depend on whether you're comparing the upper or lower bounds of the two intervals. This is also where (3) breaks down because it is not the case that bound annotations in lambdas should always be compared contravariantly.
Let me illustrate this once again with the maths analogy: let's say we have two abstract functions, each bounded by some concrete functions:
Note that there are no signatures, we're only characterizing the functions via their bounds. Now suppose we want to instantiate these two functions to
Note the change of bounds annotations (i.e. domain of definition). Which of these instantiations are OK? Intuitively, it's clear that (1) is problematic: we cannot use a function defined only on the positive reals where one defined on all reals is expected. The converse is possible though, so (2) is fine.
But what does our bounds-only algorithm say? To check that, we have to compare the bounds of all the functions, including their bound annotations. We have to
where I have omitted upper bound annotations for readability. You can already see the problem: in each case, we need to compare lambdas with wider bound annotations to those with narrower bound annotation and vice versa. If we simply always pick the annotation from the LHS or RHS lambdas (i.e. check lambdas covariantly or contravariantly), then at least one check in (1) and (2), respectively, will always fail, and both instantiations are rejected.
If you are sceptical of this type of analogy, here's a Scala example that exhibits the same issue:
trait B {
type F[X <: Int] >: Int <: Any
type G[X] >: Int <: Any
type H1[X >: Int] >: Int <: Any
type H2[X >: Int] >: Int <: Any
}
trait C extends B {
type F[X] = Any // should be OK
type G[X <: Int] = Any // should fail
type H1[X >: Int] = X // should be OK
type H2[X] = X // should be OK
}
Scala 2.12 seems to get F, G and H1 right; Dotty accepts only H1.
So why does the kind-based approach (1) not suffer from this issue? In the kind language, there is only one domain kind (and therefore only one bounds annotation) for function kinds and hence bound annotations cannot get "out of sync". The kind-based approach does all the same bound checks as the interval-based one you suggest, but there's no confusion which bound annotations to use when doing so.
The problem is that this only works if you keep track of both ends of the interval, the lower and the upper bound, at all times, and enforce their consistency. If you don't do that, the bound annotations in lambdas may get out of sync, and then you suddenly don't know which one you should enforce.
Indeed, some time ago Martin mentioned that this is not done and should ideally be changed, but it's somewhat invasive. And your examples of regressions from 2.12 probably raise the priority of this issue a bit.
This is also where (3) breaks down because it is not the case that bound annotations in lambdas should always be compared contravariantly.
Does 3 break down also if you keep the bounds in sync? It seems not...
@sstucki Here's an updated tentative proposal — as concrete as possible. Rationale coming separately. @smarter As @sstucki says, type parameters in upper and lower bounds should be kept consistent. Scala2 seems to enforce that, Scala3 doesn't, and that's a problem. Here's a fixed proposal, closer to Gist HK-DOT, together with problematic examples.
Scala 2 forbids bound mismatch in upper and lower bounds: you can write
object Foo1 { type T[X] >: Int <: Any }
but not
object Foo2 { type T >: [X] => Int <: [X <: Int] => Any }
Scala 3 supports Foo2 both internally and as user code, but HK-DOT forbids it, and Foo2 is pretty problematic for this issue (what kind should be inferred for things like Foo2?). It rejects Foo3 if written directly:
object Foo3 { type T >: [X] => Int <: [X <: Int] => [Y] => Any }
but Foo3 can be represented internally, and IIRC obtained indirectly.
As mentioned, Martin finds at least Foo3 problematic; the fix he proposed would use HK-DOT's representation of kinds and also forbid Foo2.
As Foo2 is not supported by Scala 2, we have no backward-compatibility requirements. Moreover, IIUC, nobody has asked for support for Foo2. Hence, we might want to add an issue to outlaw it. We might consider adding some syntax like type T :: [X] => >: Int <: Any or type T :: [X] => Int..Any, to avoid repeating bounds that must coincide; or outlaw lambdas in bounds (but support currying on the LHS).
This is based on Gist HK-DOT, so intervals can only be formed at kind *, which prevents applying (K-..-<::-..) to compare lambdas (which, as you explain, gives the wrong results).
G, X >: L2 <: U2 |- K1 <: K2
G |- [X >: L2 <: U2] <:: [X >: L1 <: U1]
----------------------------------------------------------------------------- (K-..→-<::-..→')
G |- [X >: L1 <: U1] => K1 <:: [X >: L2 <: U2] => K2
Γ ⊢ S₂ <: S₁ Γ ⊢ T₁ <: T₂
---------------------------- (K-..-<::-..)
Γ ⊢ S₁..T₁ <:: S₂..T₂
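For what it's worth, these two rules are simple enough to prototype. Here is a toy subkind checker over a four-point type lattice (all names and the encoding are illustrative, not the compiler's):

```scala
// Toy type lattice: Bot (Nothing) <: Str, Intt <: Top (Any).
sealed trait Tp
case object Bot extends Tp; case object Str extends Tp
case object Intt extends Tp; case object Top extends Tp

def sub(a: Tp, b: Tp): Boolean = a == Bot || b == Top || a == b

sealed trait Kind
case class Interval(lo: Tp, hi: Tp) extends Kind            // L..U, at kind *
case class Arrow(pLo: Tp, pHi: Tp, res: Kind) extends Kind  // (X >: L <: U) -> K

def subkind(k1: Kind, k2: Kind): Boolean = (k1, k2) match {
  // (K-..-<::-..): S1..T1 <:: S2..T2 iff S2 <: S1 and T1 <: T2 (only at kind *)
  case (Interval(l1, u1), Interval(l2, u2)) => sub(l2, l1) && sub(u1, u2)
  // (K-..→-<::-..→'): parameter interval contravariant, result covariant
  case (Arrow(l1, u1, r1), Arrow(l2, u2, r2)) =>
    sub(l1, l2) && sub(u2, u1) && subkind(r1, r2)
  case _ => false
}

// @smarter's counterexample: [X <: String] => Any..Any <:: [X] => Nothing..Any
// fails: the parameter check would need the false premise Any <: String.
val counterexample =
  subkind(Arrow(Bot, Str, Interval(Top, Top)), Arrow(Bot, Top, Interval(Bot, Top)))

// A simplified version of the OP's example is accepted:
// [Y <: Any] => Any..Any <:: [Y <: String] => String..Any
val opExample =
  subkind(Arrow(Bot, Top, Interval(Top, Top)), Arrow(Bot, Str, Interval(Str, Top)))
```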
Concretely, the fix that Martin hinted at would represent internally [X] => >: L <: U, that is (more or less) HKTypeLambda(X, TypeBounds(L, U)), instead of >: [X] => L <: [X] => U, that is (more or less) TypeBounds(HKTypeLambda(X, L), HKTypeLambda(X, U)).
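The two representations can be contrasted with toy ADTs (illustrative stubs that only borrow the names HKTypeLambda/TypeBounds; they are not dotc's actual classes), using Foo2's bounds as data:

```scala
// Illustrative stubs, not dotc's actual types.
sealed trait Tp
case class Ref(name: String) extends Tp
case class TypeBounds(lo: Tp, hi: Tp) extends Tp
case class HKTypeLambda(param: String, paramBounds: TypeBounds, body: Tp) extends Tp

// Current encoding of Foo2's `type T >: [X] => Int <: [X <: Int] => Any`:
// bounds of lambdas, so the two parameter annotations can diverge.
val current = TypeBounds(
  HKTypeLambda("X", TypeBounds(Ref("Nothing"), Ref("Any")), Ref("Int")),
  HKTypeLambda("X", TypeBounds(Ref("Nothing"), Ref("Int")), Ref("Any")))

// Proposed encoding: a lambda over bounds. There is a single parameter
// annotation by construction, so Foo2's mismatch is unrepresentable;
// Foo1's `type T[X] >: Int <: Any` becomes:
val proposed = HKTypeLambda("X",
  TypeBounds(Ref("Nothing"), Ref("Any")),
  TypeBounds(Ref("Int"), Ref("Any")))
```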
That fix would be invasive, but would not require using an expected kind in the subtype checker, or the addition of a subkind checker. I am not sure which change is less invasive. It might be useful to know if this proposal is equivalent to passing an expected kind during subkind checking, as @sstucki proposes.
The problem is that this only works if you keep track of both ends of the interval, the lower and the upper bound, at all times, and enforce their consistency.
@sstucki It seems that's what Gist HK-DOT (tries to) do, and IIUC succeeds at?
Kinds K are really just type intervals K = L..U, so we may as well simplify (1) and just check that LB <: LC and UC <: UB.
I see where that goes wrong: that only works at "kind" *, not at higher kinds.
And in particular, (x: T1) => L1..U1 <:: (x: T2) => L2..U2 does not require (x: T1) => L1 >: (x: T2) => L2, but only that (x: T2) => L1 >: (x: T2) => L2; that's why we can't use (K-..-<::-..) at higher kinds.
2. check λ(x ≥ 1).-log(x) ≤ λ(x ≥ -∞).0 and λ(x ≥ -∞).0 ≤ λ(x ≥ 1).log(x).
IIUC, to check that the "kind" of g(x) = 0 is a "subkind" of g(x ≥ 1) such that -log(x) ≤ g(x) ≤ log(x), rule (K-..→-<::-..→') (note the prime) asks us to check x ≥ 1 ⊢ 0..0 <:: -log(x)..log(x), which is true. IIUC, your example illustrates (K-..→-<::-..→) (no prime); that rule should then be discarded.
If you are sceptical of this type of analogy, here's a Scala example that exhibits the same issue:
I'm happy to use a set-theoretic semantics to justify these rules, but I struggle to compile back and forth from Scala to these examples, and to understand where exactly things go wrong.
Last, let me explain the fixed proposal in your words.
- To check a type declaration in a class/trait C against the one in its super-class/-trait B, just compare the inferred kinds of the two declarations: we should have KC <:: KB.
Yes. The rule I wrote is for subtype checking, but here in fact we need a rule for subkind checking, that is, (K-..→-<::-..→').
Kinds K are really just type intervals K = L..U, so we may as well simplify (1) and just check that LB <: LC and UC <: UB.
This is rule (K-..-<::-..); but, as I hadn't fully realized, that should only happen at kind *.
To do (3) we need to compare type lambdas, and we should do so with a rule that treats bound annotations contravariantly – in accordance with the subkinding rule for arrow kinds from the HKDOT gist.
We should not use (K-..-<::-..) at higher kinds, but we should use the adapted subkinding rule for arrow kinds from the HKDOT gist. And there, bound annotations are always treated contravariantly, because we never compare higher-kinded lower bounds.
@Blaisorblade the notation that you are using in (K-..→-<::-..→') is confusing to me. This is a kinding rule, right? But the notation looks a lot like you're comparing type lambdas. But I guess these are dependent arrow kinds? Maybe we shouldn't overload => even more than it already is...
Does 3 break down also if you keep the bounds in sync? It seems not...
Yes, (3) breaks down even when you keep the bounds in sync. It's important not to confuse two related but distinct issues here:
1. (sub)kind checking of type declarations and their refinements, and
2. subtype checking of type lambdas.
The first point is a job for kind checking (or rather, subkind checking), the second point is a question of subtyping. Unfortunately, these seem to be heavily entangled in Dotty at the moment, but they are really not the same thing. The confusion stems from the fact that kinds may be thought of as intervals and subkinding as checking that the bounds of intervals are compatible. I have already explained how this intuition can break down if the kinds of the two ends of an interval are not kept in sync. However, even if subtyping of lambdas is currently used to perform subkind checking, that does not mean fixing the latter somehow fixes the former.
It does not help that my OP uses subkind checking (1) to trigger a bug in subtype checking for lambdas (2). But again, fixing (1) will not fix (2). Obviously both should be fixed.
That being said, it's really hard to come up with actual Scala snippets that trigger subtype checking of lambdas but not some form of kind/subkind checking. I'll have to think a bit more about that. For now, I'll make arguments based on the theory that I'm familiar with.
OK, so back to subtyping lambdas with different bound annotations. You may wonder, why not enforce equal bounds when subtyping lambdas? But that is too restrictive: if you do this, subsumption (for types) in combination with eta/beta-conversion doesn't work as you'd expect. You really want to be able to relate T1 = [X] => Foo[X] and T2 = [X <: Int] => Foo[X] because they are beta-eta-equal in the kind (X <: Int) -> *: you can eta-expand T1 to [Y <: Int] => ([X] => Foo[X])[Y], then beta-reduce that type to T2.
Having well-behaved intervals also doesn't change the fact that a single subtyping rule for lambdas that always picks either the LHS or RHS bounds is too restrictive. E.g. in the above example T1 has a larger domain than T2, but they are mutual subtypes, so whichever rule you pick would fail in one direction. You may say, well, let's have both versions of the rule then and switch between them, e.g. depending on whether we compare upper bounds or lower bounds. But it's not clear how that would work, especially once you have to deal with bound annotations that are themselves lambdas. What really determines the bound annotation is the kind/interval within which you are comparing the two lambdas.
The only way I see, for comparing lambdas consistently, is to somehow take into account the common bounds of the lambdas as well, something like
G |- L <: [X: L1..U1] => T1 <: [X: L2..U2] => T2 <: U
where L and U would be the overall bounds and we'd need to make sure that their bound annotations are in sync. But then, how is this different from the judgment
G |- [X: L1..U1] => T1 <: [X: L2..U2] => T2 :: L .. U
which, IMO, just looks like a kinded subtyping judgment again. So, it's unclear to me what you'd gain by this.
My impression is that this whole business of mixing up kinding and subtyping looks like a nice shortcut initially, but it is not really simplifying things. If anything, it's actually just making things more complicated in the end.
IUC, to check that the "kind" of g(x) = 0 is a "subkind" of g(x ≥ 1) such that -log(x) ≤ g(x) ≤ log(x), rule (K-..→-<::-..→') (note the prime) asks us to check x ≥ 1 ⊢ 0..0 <:: -log(x)..log(x), which is true. IIUC, your example illustrates (K-..→-<::-..→) (no prime); that rule should then be discarded.
I'm not sure why you say the rule should be discarded. This part of the example (instantiation of g) should be accepted. It's the instantiation of f that is problematic. Rule (K-..→-<::-..→') should not be discarded.
I'm happy to use a set-theoretic semantics to justify these rules, but I struggle to compile back and forth from Scala to these examples, and to understand where exactly things go wrong.
The problem with the set-theoretic interpretation for kinds and types is that it's easy to confuse typing and kinding, subtyping and subkinding, because both types and kinds are – in some sense – sets of values. If we use numbers and sets of numbers instead, it's clear that they are very different objects, and there is no confusion between the order on the reals and the subset order. The fact that it's hard to translate back and forth between them is intended, to some degree, because it forces us to be careful about the assumptions and intuitions that hold for the subset order but not necessarily for other orders (in this case, subtyping).
As @sstucki highlighted, we have two issues. One of them calls for rule (K-..→-<::-..→') (note the prime), but it's unclear how to implement that for the current representation of kinds, which is far too liberal. Since the discussion here is intertwined, we'll keep this one thread, but eventually this discussion should produce distinct issues for the two parts.
@Blaisorblade the notation that you are using in (K-..→-<::-..→') is confusing to me. This is a kinding rule, right? But the notation looks a lot like you're comparing type lambdas. But I guess these are dependent arrow kinds? Maybe we shouldn't overload => even more than it already is...
Yes, [X >: L1 <: U1] => K1 is a dependent arrow kind; that's meant to be forced by the K there. The gist does the same overloading, with (x:S) → T and (x:S) → K, which isn't really overloading to me, as shown by PTS syntax.
IUC, to check that the "kind" of g(x) = 0 is a "subkind" of g(x ≥ 1) such that -log(x) ≤ g(x) ≤ log(x), rule (K-..→-<::-..→') (note the prime) asks us to check x ≥ 1 ⊢ 0..0 <:: -log(x)..log(x), which is true. IIUC, your example illustrates (K-..→-<::-..→) (no prime); that rule should then be discarded.

I'm not sure why you say the rule should be discarded. This part of the example (instantiation of g) should be accepted. It's the instantiation of f that is problematic. Rule (K-..→-<::-..→') should not be discarded.
Sure, g is fine and f is not. I just said to discard the primeless subtyping rule (K-..→-<::-..→), not the primed subkinding rule (K-..→-<::-..→'). I think I paid attention not to confuse the two. I was accepting that your example was convincing. Since you didn't state your thesis, I inferred that it was an argument against (K-..→-<::-..→), which I accepted. But thinking again, you were maybe just arguing against (K-..-<::-..) at higher kinds.
OK, so back to subtyping lambdas with different bound annotations. You may wonder, why not enforce equal bounds when subtyping lambdas?
HK-DOT allows different bounds in the subtyping of lambdas, thanks to (T-→-<:-→). I see you don't like that, but I don't yet see crystal-clear arguments.
That being said, it's really hard to come up with actual Scala snippets that trigger subtype checking of lambdas but not some form of kind/subkind checking.
Given trait Foo {type T[F <: [X] => Int]}, I'd require that Foo.T[G] must check G <: [X] => Int, since that's what the syntax promises. Hence, lambda subtype checking must support this example.
Arguably, this example can also be checked using kind or subkind checking. Still, the answer from subtyping is the correct one, and I expect that the answers from kinding and subkinding should agree, and/or that we need guidelines for which to pick. Here, I think the expected kind would be [X] => Int, so the answer would be equivalent whichever reasonable formulation you pick.
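A minimal sketch of that check (the use site and the name G are hypothetical, written in the pre-release lambda syntax used throughout this thread):

trait Foo { type T[F <: [X] => Int] }

def use(f: Foo): Unit = {
  type G[X] = Int
  // Applying f.T should force the bounds check G <: [X] => Int,
  // i.e. a subtype check between (eta-expansions of) type lambdas.
  type Applied = f.T[G]
}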
I didn't get your argument here, but it's clearer in your thesis, and I am thinking about it.
The subtlest part: subtyping of lambdas per se
So, I have found the one case where one would use Γ ⊢ T1 <: T2 :: K1 with a K1 not coming from T2, which seems, really, the crucial part. That's needed to check Γ ⊢ X T1 <: X T2:
Γ ⊢ X :: K0
Γ ⊢ K0 promotes to ∀ X :: K1. K2
Γ ⊢ T1 <: T2 :: K1
-----------------
Γ ⊢ X T1 <: X T2
"Γ ⊢ K0 promotes to ∀ X :: K1. K2" means the same as in TAPL: find the least superkind of K0 that matches the given shape. Dotty's widen... methods all seem to be forms of promotion for different patterns.
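A sketch of where such a promotion step would fire (names hypothetical, same notation as the examples further down in this thread):

trait Promo {
  type X[+F[Y <: Int]]  // X :: ((Y <: Int) -> *) -> *, covariant in F
  // Checking X[T1] <: X[T2] promotes the kind of X to an arrow kind; its
  // domain (Y <: Int) -> * becomes the expected kind K1 for the premise
  // T1 <: T2 :: K1.
  def app(v: X[[Y <: Int] => Y]): X[[Y <: Int] => Any] = v
}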
And in your thesis, that seems the only case where the expected kind K1 is not inferrable from either T1 or T2.
In Scala, when checking that Γ ⊢ F T1 <: F T2 (for some F, which EDIT must be covariant), we might be too conservative, because T1 <: T2 might be false in general, but true under the context induced by F. For instance, [X] => [Y] => X <: [X] => [Y] => Y is false in general, but true under the right bounds (e.g. at a kind where Y is bounded below by X, so that X <: Y holds when comparing the bodies).
@Blaisorblade, I'm not sure I fully understand your reasoning, but the idea of using type application to trigger subtype checking makes sense to me. Here's a variant (?) of your code that should trigger the same subtype checks as the OP as well as the example involving beta-eta-equality I mentioned earlier.
object LambdaTests {
/**
* This abstract type function takes arguments that are themselves
* type functions with their first arguments bounded by `Int`. In
* other words, it "fixes" the expected kind of its arguments to be
* `(X <: Int) -> *`. Any subtype check of the form
*
* FixA[T1] <: FixA[T2]
*
* should therefore trigger
*
 * 1. a kind check Ti :: (X <: Int) -> * for each argument Ti (i = 1, 2),
 * 2. a subtype check T1 <: T2 at kind (X <: Int) -> * .
*
*/
type FixA[+X[Y <: Int]]
// Similar fixtures for (X <: String) -> * and (X <: Int | String) -> *
type FixB[+X[Y <: String]]
type FixAB[+X[Y <: Int | String]]
// Lower bounds of type declarations from OP
type LA[Y <: Int] = Int
type LB[Y <: String] = String
type LAB[Y <: Int | String] = Int | String // lower bound of L in intersection A & B
type LC[Y <: Any] = Any
// Duplicate definitions of the above for sanity checks.
type LA2[Y <: Int] = Int
type LB2[Y <: String] = String
type LAB2[Y <: Int | String] = Int | String
type LC2[Y <: Any] = Any
/***** TRUE POSITIVES: should pass and do pass *****/
// Sanity checks: types with the same definitions should be equal (all work).
def tARefl(x: FixA[LA2]): FixA[LA] = x
def tBRefl(x: FixB[LB2]): FixB[LB] = x
def tAABRefl(x: FixAB[LA2]): FixAB[LA] = x
def tBABRefl(x: FixAB[LB2]): FixAB[LB] = x
def tACRefl(x: FixA[LC2]): FixA[LC] = x
def tBCRefl(x: FixB[LC2]): FixB[LC] = x
def tABCRefl(x: FixAB[LC2]): FixAB[LC] = x
// Variations of test AL below
def tAL2(x: FixA[[Y <: Int] => Int]): FixA[[Y <: Int] => Int] = x
def tAL3(x: FixA[[Y <: Int] => Int]): FixA[[Y <: Int] => Any] = x
def tAL4(x: FixA[[Y <: Any] => Int]): FixA[[Y <: Int] => Int] = x
def tAL5(x: FixA[[Y <: Any] => Int]): FixA[[Y <: Any] => Int] = x
/***** FALSE NEGATIVES: should pass but don't *****/
// Subtype checks between lower bounds from the OP
def tA(x: FixA[LC]): FixA[LA] = x
def tB(x: FixB[LC]): FixB[LB] = x
def tAB1(x: FixAB[LC]): FixAB[LAB] = x
def tAB2(x: FixAB[LC]): FixAB[LA | LB] = x
// Variant of subtype checks from the OP using lambdas
def tAL(x: FixA[[Y] => Any]): FixA[[Y <: Int] => Int] = x
def tBL(x: FixB[[Y] => Any]): FixB[[Y <: String] => String] = x
def tABL(x: FixAB[[Y] => Any]): FixAB[[Y <: Int | String] => Int | String] = x
// Variations of test AL with contravariant bounds
def tAL6(x: FixA[[Y <: Int] => Int]): FixA[[Y] => Int] = x
def tAL7(x: FixA[[Y <: Int] => Int]): FixA[[Y] => Any] = x
/***** TRUE NEGATIVES: should not pass and don't *****/
// Subtype checks with incompatible fixtures
def tFAB(x: FixA[LB]): FixA[LB] = x
def tFBA(x: FixB[LA]): FixB[LA] = x
def tFABL(x: FixA[[Y <: String] => String]): FixA[[Y <: String] => String] = x
def tFBAL(x: FixB[[Y <: Int] => Int]): FixB[[Y <: Int] => Int] = x
}
In Scala, when checking that Γ ⊢ F T1 <: F T2 (for some F, which EDIT must be covariant), we might be too conservative, because T1 <: T2 might be false in general, but true under the context induced by F.
Interesting, if I may bring variance into the discussion (like things weren't complicated enough already!), this restriction seems to be why we need adaptHkVariances: https://github.com/lampepfl/dotty/blob/3ef991ddbe7db95de2863357f3cc1a34ca95fb6d/compiler/src/dotty/tools/dotc/core/TypeApplications.scala#L292-L323
this restriction seems to be why we need adaptHkVariances.
Interesting! Though it doesn't seem to help when subtyping applications:
trait EtaExpandVariance {
type FixVar[F[X]] // Expect operator with invariant argument
type ListInv[ X] = List[X]
type ListCov[+X] = List[X]
// Both of these should work and do work.
val li: FixVar[ListInv] // type checks
val lc: FixVar[ListCov] // type checks
// These should all work -- eta-expansion should not matter!
val lc1: FixVar[ListCov] = li // fails
val li1: FixVar[ListInv] = lc // fails
val lc2: FixVar[[X] => ListCov[X]] = li // type checks
// Uses X in an invariant position. Should probably not kind check...
val li2: FixVar[[+X] => ListInv[X]] = lc // type checks
}
For completeness: an analogous example with bounds instead of variances.
trait EtaExpandBounds {
type FixBnd[F[X <: Int]]
type ListBounded[X <: Int] = List[X]
type ListUnbounded[X] = List[X]
// Both of these should work and do work.
val lb: FixBnd[ListBounded] // type checks
val lu: FixBnd[ListUnbounded] // type checks
// These should all work -- eta-expansion should not matter!
val lu1: FixBnd[ListUnbounded] = lb // fails
val lb1: FixBnd[ListBounded] = lu // fails
val lu2: FixBnd[[X <: Int] => ListUnbounded[X]] = lb // type checks
// Uses unbounded X where a bound is expected. Should not kind check.
val lb2: FixBnd[[X] => ListBounded[X]] = lu // fails
}
This issue was previously reported in a comment to #1252. I'm re-reporting this as requested by @smarter. (The original issue was fixed, but the issue reported in the comment was not).
Given the code
The compiler reports
The problem is that, when the compiler checks that the bounds of L declared in C conform to those declared in B, it compares the bound annotations of the type-lambdas that form those bounds. But when checking whether two lambdas are subtypes, their bounds need not be compared, as long as both lambdas have a common kind.
Concretely, here we want all the lambdas to have the (wider) kind (Y <: String) -> *, which is the declared kind of L in the parent trait B: so all the bounds are well-kinded and their kinds conform to the signature in the parent class. Then we need to check that the definitions of the lambdas agree in this kind, which they do:
With @smarter and @Blaisorblade, we have discussed two possible fixes:
Make subtype checking kind driven: when comparing two lambdas, use the expected bounds from the kind when comparing the bodies. This can be summed up in the following kinded subtyping rule.
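The rule itself is not reproduced here; in the notation used earlier in this thread, a sketch of such a kind-driven rule (an assumption on my part, not necessarily the exact formulation from the thesis) would be:

G, X: L..U |- T1 <: T2 :: K
-------------------------------------------------------------
G |- [X: L1..U1] => T1 <: [X: L2..U2] => T2 :: (X: L..U) -> K

i.e. the bodies are compared under the bounds L..U taken from the expected kind, not under either lambda's own annotations.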
The advantage of this solution is that it's known to be sound (it has been verified at least for a subset of the type system in my thesis). The drawback is that subtype checking would have to be rewritten to take an additional input: the expected kind of the types being compared. That would be a major change.
Find a compatible, common bound by computing the intersection/union of the bounds in the individual lambdas. As a subtyping rule, this would look as follows.
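The rule is not reproduced here; a sketch (an assumption on my part, using the union of lower bounds and intersection of upper bounds as the common interval) would be:

G, X: (L1 | L2)..(U1 & U2) |- T1 <: T2
-------------------------------------------
G |- [X: L1..U1] => T1 <: [X: L2..U2] => T2

i.e. the bodies are compared over the intersection of the two domains, which is the widest interval on which both lambdas are defined.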
The advantage of this solution is that subtype checking does not have to be redesigned completely, while the drawback is that it requires computing the intersections/unions of higher-kinded types. That can probably be done point-wise (i.e. by pushing intersections/unions into lambdas) but I have not worked out the details nor have I got any sort of soundness proof.
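For illustration, pushing intersections/unions into lambdas point-wise might look like this (a sketch under the assumption that both lambdas have the same simple kind; not verified):

([X: L1..U1] => T1) & ([X: L2..U2] => T2)  ~~>  [X: (L1 | L2)..(U1 & U2)] => (T1 & T2)
([X: L1..U1] => T1) | ([X: L2..U2] => T2)  ~~>  [X: (L1 | L2)..(U1 & U2)] => (T1 | T2)

In both cases the argument interval is the intersection of the two domains; only the bodies are combined with & or |.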