Open CeylonMigrationBot opened 11 years ago
[@gavinking] In #3686, @RossTate also mentions this simpler one:
interface Inv<P> {}
interface A satisfies Inv<A&Inv<A>> {}
[@gavinking] So I have now got the typechecker rejecting both these cases by disallowing stuff like this:
T satisfies A<B<T>>
while still allowing this:
T satisfies A<T>
i.e. a type can appear as a type argument one-level-deep, but not two-levels-deep.
This might be heavy handed, and it might be insufficient, so the questions are:
T satisfies A<B<T>>
is actually useful?[@gavinking] Rm, I can currently circumvent it like this:
interface In<P> {}
alias Y => In<X>;
interface X satisfies In<X&Y> {}
So I need to handle aliases a bit better, apparently.
[@gavinking] OK, I have an ugly and even heavier-handed way to prevent circumvention via an alias.
[@gavinking] Hrm, so I can still circumvent it as simply as this:
interface In<P> {}
interface Y satisfies In<X> {}
interface X satisfies In<X&Y> {}
[@gavinking] So my new solution to this problem is to disallow:
I know that sounds heavy-handed, but it's something I can actually determine without needing to run the subtyping algorithm or introduce a special new compilation phase. And so far I have not noticed anything useful that these rules preclude.
[@gavinking] In fact, I think I might be able to remove the first restriction if I enforce the second restriction in AliasVisitor
instead of TypeVisitor
. I forgot that we already have a separate phase for dealing with aliases and circularity.
[@gavinking] OK, first restriction is gone, leaving the following as disallowed:
- an intersection as a type argument to an invariant type parameter in a satisfied/extended type
[@RossTate] This still seems to crash things. Note it doesn't have anything invariant anywhere.
interface Co<out P> {}
interface A satisfies Co<B&Co<A>> {}
interface B satisfies Co<A&Co<B>> {}
Co<A&B&Co<A&B>> foo(Co<A&B> co) {return co;}
I'm guessing what it's crashing is your canonicalization algorithm. I still don't believe you need a canonicalization algorithm.
[@gavinking]
This still seems to crash things. Note it doesn't have anything invariant anywhere.
Damn, that sucks :-(
I'm guessing what it's crashing is your canonicalization algorithm.
Yes. So there's two ways I guess I could eliminate this case:
extends
/satisfies
clauses. This seems like it would prevent some useful type definitions, but it doesn't appear to be anything we use in ceylon.language
or the SDK.I still don't believe you need a canonicalization algorithm.
I can't imagine how else I would deal with complex types involving union and intersection. Note that the compiler now leaves types uncanonicalized until it's time to determine if one is a subtype of another. Then at that point it goes ahead and resolves aliases and canonicalizes to a union of intersections.
[@RossTate] Here's another example that's currently broken. Rather than crashing, you reject foo
even though it's valid code. This suggests your canonicalization algorithm is incomplete even when there are no intersections in inheritance clauses at all.
interface A satisfies Co<Co<A>> {}
interface B satisfies Co<Co<B>> {}
A&Co<B&Co<A>> sub(A&Co<B> sub) {return sub;}
A&Co<B> sup(A&Co<B&Co<A>> sup) {return sup;}
Inv<A&Co<B&Co<A>>> foo(Inv<A&Co<B>> foo) {return foo;}
[@RossTate] As for your circularity proposition, I'm currently doing some research on that. There are definitely practical uses of circularity, so you can't just disallow it, but I'm looking into options. Those options might address the concerns here, but it's a big change enough change that this problem alone doesn't call for it even if it works out.
[@gavinking] What is the definition of Co
? Same as above?
[@RossTate] Yep. Co
for covariant and Inv
for invariant. Sorry, I got lazy.
[@gavinking]
you reject
foo
even though it's valid code.
I don't follow: why is A&Co<B&Co<A>>
equivalent to A&Co<B>
? What am I missing?
[@RossTate] sub
and sup
type check, illustrating that the two are equivalent. What you're missing is that A
is equivalent to A&Co<Co<A>>
, so A&Co<B>
is equivalent to A&Co<Co<A>>&Co<B>
, which is equivalent to A&Co<Co<A>&B>
and consequently A&Co<B&Co<A>>
.
[@gavinking] OK, got it. So ProducedType.isExactly()
is broken. Indeed, it seems to be broken in a rather nasty way where I might need a whole new algorithm. Apparently, my canonicalizations are less unique than I thought.
[@gavinking] I guess I should just replace x.isExactly(y)
with x.isSubtypeOf(y)&&y.isSubtypeOf(x)
. That removes some code, at least!
[@gavinking] Well, doing that got me a stack overflow, which I'll have to try and figure out...
[@RossTate] Depending on the system, using double subtyping versus some specialized canonicalization algorithm can introduce non-termination. Remember that contravariance is the source of problems for non-termination in subtyping because it makes the types being compared flip sides, which double subtyping will do as well.
[@gavinking] So even after eliminating the test that caused the stack overflow, the tests ran like 10x slower with this change. So definitely not the right path. So what's a correct implementation of isExactly()
?
[@gavinking] Is there some way we can recognize that A&Co<B&Co<A>>
can be canonicalized to A&Co<B>
?
[@gavinking] I suppose we could simply recognize that A
is an invariant subtype of Co<Co<A>
, and then do B&Co<A>~Co<A>
to arrive at A&Co<B>
. Sounds conceptually possible.
[@RossTate] I really don't know. And remember, this is an "easy" case; it doesn't use intersections or invariance in inheritance, nor union types anywhere. That's why I'd much rather just not allow the type at all. It'd help me if I better understood why you don't want to restrict types. Is it for implementation reasons? In which case, what goes wrong? Or is it for design principles? Or is it because there's some use-case you think would want to use these types?
[@gavinking] OK, so clarifying something. The issue here is that under the principal instantiation inheritance axiom, the following identities hold:
Co<X>&Co<Y> == Co<X&Y>
Contra<X>&Contra<Y> == Contra<X|Y>
These aren't otherwise valid, and canonicalization doesn't take them into account.
[@RossTate] Right. So if A extends Co<A>
and B extends Co<B>
, then Co<A&B>
is redundant in the intersection A&B&Co<A&B>
even though it is not a supertype of either A
or B
.
[@gavinking] OK, that particular problem is now solved, and we're back to the question of nontermination.
[@gavinking] P.S. FTR, the fix for (this)(#3702#issuecomment-21301325) was quite simple in the end. A little adjustment to isExactly()
, and not related to the main topic of this issue.
[@gavinking] OK, great, so I now have a restriction that detects all the problem cases given so far. Basically, I noticed that they all have one thing in common:
So that's what I'm now disallowing. The nice thing is that it's not overly-heavy-handed.
[@gavinking] In fact, the problem does not seem to be limited to intersections. I can also get nontermination for unions, for example:
interface Inv<P> {}
interface A satisfies Inv<A|Inv<A>> {}
So I've extended the restriction to also cover unions.
[@gavinking] Unfortunately, extending the restriction to union types prevents the following slightly-nice pattern:
interface A satisfies Comparable<C|A> {}
interface C satisfies Comparable<C|A> {}
However, I'm certain we can live without this.
[@RossTate] I don't understand what your fix is. Can you explain how it solves or prevents this example:
interface A satisfies Co<Co<A>> {}
interface B satisfies Co<Co<B>> {}
A&Co<B&Co<A>> sub(A&Co<B> sub) {return sub;}
A&Co<B> sup(A&Co<B&Co<A>> sup) {return sup;}
Inv<A&Co<B&Co<A>>> foo(Inv<A&Co<B>> foo) {return foo;}
P.S. Good find with A extends Inv<A|Inv<A>>
!
[@gavinking]
@RossTate The issue is that you've mixed together two issues in this one issue. I fixed the bug in isExactly()
that was rejecting that code, and I have added the following test, which is accepted by the typechecker:
interface Co<out P> {}
interface Contra<in P> {}
interface Inv<P> {}
interface A satisfies Co<Co<A>> {}
interface B satisfies Co<Co<B>> {}
A&Co<B&Co<A>> sub1(A&Co<B> sub) => sub;
A&Co<B> sup1(A&Co<B&Co<A>> sup) => sup;
Inv<A&Co<B&Co<A>>> foo1(Inv<A&Co<B>> foo) => foo;
interface X satisfies Contra<X> {}
interface Y satisfies Contra<Y> {}
X&Contra<Y|X> sub2(X&Contra<Y> sub) => sub;
X&Contra<Y> sup2(X&Contra<Y|X> sup) => sup;
Inv<X&Contra<Y|X>> foo2(Inv<X&Contra<Y>> foo) => foo;
But the main thrust of this discussion has been about decidability. And I now speculate that I also have that problem licked! Of course, without a formal proof, it's a bit silly for me to go making grand claims like that, but I feel good about the new set of restrictions.
[@gavinking] As far as decidability goes, the basic idea is to not allow any kind of type that in any way depends upon the type being defined inside a union or intersection in any of the types it inherits.
[@gavinking] i.e. interface A satisfies Inv<A&Inv<A>> {}
is rejected because A
occurs inside an intersection. Likewise, interface A satisfies Co<B&Co<A>> {}
is rejected because A
occurs as a type argument inside an intersection. Even this:
interface A satisfies Inv<B|Inv<B>> {}
interface B satisfies A {}
must be rejected, because the type B
, which inherits A
occurs inside a union in a supertype of A
.
All the undecidable examples we've seen so far are of that form.
[@RossTate] Okay, so your restriction is to disallow types defined in terms of the class being defined from being used in intersections and unions in the inheritance clause. That may work, though it's hard to say.
How did you use that to get foo1
and foo2
to work, though?
[@gavinking]
Okay, so your restriction is to disallow types defined in terms of the class being defined from being used in intersections and unions in the inheritance clause. That may work, though it's hard to say.
Wellyes, we can't know w/o a formal proof or a counter example. But I feel good about this overall approach.
How did you use that to get
foo1
andfoo2
to work, though?
Well, I just directly use the fact that if I have types of form A&T<X>
and A&T<Y>
and if both types are invariant subtypes of T<Z>
for some Z
then the types are equivalent. It's slightly more involved than that but not much.
[@gavinking] I'm closing this issue, since the family of bugs encompassed by the initial report seems to be fixed. Further discussion belongs in #3686.
[@FroMage] This now disallows the following:
interface Bug866_Visitable<Visited> of Visited satisfies Iterable<Integer|Visited>
given Visited satisfies Bug866_Visitable<Visited> {
}
Which I wrote during some demo/hands-on. Seems a pity.
[@FroMage] Also disallows:
abstract class Real() satisfies Numeric<Real>&Scalable<Int|Real,Real> {}
[@RossTate] I don't understand your solution. What is an invariant subtype?
As for @FroMage's concerns, I do believe your restriction might be overkill.
[@gavinking]
I don't understand your solution. What is an invariant subtype?
Spec section 3.3. I believe you use the language "the principal instantiation of T
that is a supertype of ...".
As for @FroMage's concerns, I do believe your restriction might be overkill.
Quite likely, but nobody else has proposed a set of restrictions that eliminate the problematic cases and can actually be easily enforced by the compiler.
[@gavinking]
Also disallows:
abstract class Real() satisfies Numeric<Real>&Scalable<Int|Real,Real> {}
I might be able to loosen the restriction to let this one slip through. It should be possible to recognize that intersections like Int|Real
are safe since Int
and Real
are both classes and don't involve type parameters.
[@gavinking] Indeed, all the examples we've seen involve something of form:
A&C<B>
or:
A|C<B>
In satisfies
or extends
.
It seems likely that unions and intersections of form A&B
or A|B
(no type arguments anywhere) can't result in nontermination, since the canonicalization algorithm can quickly determine that any such type is already in canonical form.
If that's right, there would be no reason to reject either of @FroMage's examples.
[@gavinking] Ah, excellent, I just managed to construct a new nonterminating example, which suggests I understand problem a lot better now:
interface S<out P> {}
interface T<out P> satisfies S<P> {}
class C() satisfies S<C>&T<S<C>> {}
Since the compiler needs to construct a principal instantiation of S
for C
, it's as if we had written:
class C() satisfies S<C&S<C>> {}
which is a case we already know is problematic.
So the question now is, how can I recognize cases like the above as problematic, while still allowing stuff like this:
class C() satisfies Numeric<C>&Comparable<C> {}
The problem seems to arise when the "recursive" type (in this case C
) occurs at different depths on either side of the intersection/union. But of course I can hide the depth mismatch like so:
interface S<out P> {}
interface T<out P> satisfies S<S<P>> {}
class C() satisfies S<C>&T<C> {}
So what's the right way to distinguish S<C>&T<C>
from Numeric<C>&Comparable<C>
? Well, the difference is simply that Numeric
and Comparable
have no common parameterized supertype, so the need to form a principal instantiation of the supertype never arises.
So I think maybe we wind up with the rule that:
Within the intersection of all direct supertypes of a type
T
, we cannot have any intersection or union of formA&B
orA|B
where bothA
andB
inherit a common parameterized supertypeX
, and bothA
andB
somehow involveT
.
Where "X somehow involves Y" means X
is a type that inherits either Y
or a type that somehow involves Y
or X
is a type expression with a type argument that somehow involves Y
.
[@gavinking]
It seems likely that unions and intersections of form
A&B
orA|B
(no type arguments anywhere) can't result in nontermination
This is nonsense on stilts, please ignore. Counterexample:
interface S<out P> {}
interface A satisfies S<S<C>> {}
interface B satisfies S<C> {}
class C() satisfies A&B {}
That is, C
is sufficient to crash the typechecker, even given its totally harmless-looking satisfies
clause, since A
and B
inherit a common supertype S
, and are both defined in terms of C
.
[@gavinking] A cleaner and less heavy-handed statement of the rule might be something like this:
Within the intersection of all direct supertypes of a type
T
, we cannot have any intersection or union of formA&B
orA|B
whereA
is a subtype ofT<AA>
andB
is a subtype ofT<BB>
for some common parameterized typeX
, and bothAA
andBB
somehow involveT
.
The problem with this formulation is that to determine every AA
such that "A
is a subtype of T<AA>
" without being able to directly form the principal instantiation of T
for A
is potentially difficult.
[@gavinking] No, these formulations are broken, since they reject stuff like this:
satisfies Scalar<Integer> &
Integral<Integer> &
Binary<Integer> &
Exponentiable<Integer,Integer>
Dammit!
[@gavinking] I have essentially given up on trying to develop formal rules for this one. I have not been able to discover a set of rules that disallows all the known problematic cases without also disallowing stuff that is needed in the language module. Instead, I'm just directly detecting the undecidability by attempting to canonicalize the intersection of all supertypes of the type, and detecting the case of nontermination. Nasty solution, but at this point it's all we have.
[@gavinking] Last night @RossTate proposed a potential solution that involves distinguishing types with self types from other parameterized types. In an attempt to translate his idea into something that makes sense in Ceylon's type system, I wound up with the following:
Note that this is not precisely @RossTate's way of expressing the idea, so any holes in it are mine, not Ross'.
The good news about this is that it accepts all the types we currently define in the language module, after minor changes, and rejects all the previously known undecidable types. The bad news is that I was able to construct an undecidable example, by taking advantage of a little intentionally-engineered hole in it:
interface Inv<O, out P> of O {}
interface A satisfies Inv<A,A&B> {}
interface B satisfies Inv<B,B> {}
The hole is the additional parameter, which is not a self type. Now, I can certainly close this hole easily enough, by reformulating the first rule:
(This was indeed my initial formulation.)
Unfortunately, this formulation rejects a couple of types in the language module:
Exponentiable
, which is no big deal, and ConstrainedAnnotation
, which bothers me.So the question now is, instead of tightening 1, is there a way to tighten 2 in order to eliminate the undecidable cases. There are two interesting features of the example above. In order to produce nontermination, I needed to:
Inv
invariant, andInv
.I have not yet found a way to produce nontermination for a type with a self type where all type parameters are invariant or covariant, or where no intersection occurs in an argument to a parameter of a type with a self type. So I hypothesize that the following ruleset results in termination:
Now, I'm pretty much pulling rule 3 out of my ass here. The question is: is there a reasonable conceptual foundation for it. Do the above rules really eliminate all the nonterminating cases?
[@gavinking] Spotted by @RossTate, this crashes the typechecker:
[Migrated from ceylon/ceylon-spec#596]