scala / scala3

The Scala 3 compiler, also known as Dotty.

https://dotty.epfl.ch

Apache License 2.0

5.86k stars 1.06k forks source link

Dotty does not yet implement the Kennedy and Pierce restriction #4805

Closed Blaisorblade closed 6 years ago

Blaisorblade commented 6 years ago

To remove one venue for undecidability, the SLS 5.1.5 mandates a restriction (implemented by scalac), which appears not implemented yet in Dotty. Test tests/untried/neg/t8146-non-finitary.scala and tests/untried/neg/t8146-non-finitary-2.scala trigger a "Recursion limit exceeded" exception, instead of triggering the SLS-mandated error.

The Scalac implementation (which I haven't studied) starts at https://github.com/scala/scala/blob/9df13fd826bdf1c83cc701d1fa16e9ec729ab782/src/reflect/scala/reflect/internal/Types.scala#L1725-L1728.

Blaisorblade commented 6 years ago

Q: Should we implement the restriction or drop it?

"Recursion limit exceeded" is a bad error message and it's often a suspect bug.
it seems the restriction forbids definitions which could compile, but would cause clients to fail. It seems to forbid programs which apparently do compile — one would have to study Grigore's POPL'17 paper (https://dl.acm.org/citation.cfm?id=3009871 / https://arxiv.org/pdf/1605.05274.pdf).

odersky commented 6 years ago

I still have to see someone running into a recursion unintentionally because of this. Does anyone have a case where this happened and you were not trying to trick the compiler intentionally? If not, I would vote for not having a restriction - it's complicated code that might contain bugs,increases the complexity of the compiler, and might rule out sensible programs.

Blaisorblade commented 6 years ago

I haven't searched for such natural examples and I'm not aware of any. @JanBessai pointed out (on Gitter) that each source of undecidability can hide bugs both in the compiler and in analysis tools.

And FWIW, if we decide against this, this should become a spec issue — should I add an area:spec label?

JanBessai commented 6 years ago

I think allowing undecidable subtype checking is a bad idea: Let S <: T be looping, what about S & T <: T and T & S <: T? Here, we'd need to implement non-deterministic (in practice interleaved) search, for otherwise the second one would be accepted, while the first one fails (which would also render intersection types non-commutative again...). This also destroys any hope of implementing type-matching: Suppose S and T are closed types and A is a variable, if S <: T we have A & S <: T with A = Any, otherwise we need a more specific substitution for A. However, if we cannot check S <: T there is no way to find out. From a more practical standpoint: What happens if implicit search encounters a subtype-checking loop? This would be extremely hard to debug. What if S or T is moduleBySomebodyElse.ComplicatedPathDependentType[OurComplicatedType]: Do we have to break the abstraction barrier and look inside of mouduleBySomebodyElse to figure out, why this fails? How is "somebody else" supposed to know and document "don't put OurComplicatedType in there"? Are all (most?) people encountering this going to file issues with the Scala compiler? What if some future version of Scala decides this should be outlawed (e.g. because some optimization turns (S' & T) & T <: T into S & T <: T with S' <: S)? The amount of broken legacy code will grow over time, so it is best to address this as soon as possible. Additionally from a theory standpoint: not at least specifying that S <: T is invalid, will widen the gap between and future formal core of Scala and the true language. Without the restriction to have decidable subtyping, intuitionistic theorem provers (such as Coq) don't let you assume excluded middle S <: T or not (S <: T), preventing it from ever being used in any proof about Scala. Undecidable parts in the compiler also make "CompCert" or "RustBelt"-like projects for Scala much harder (if not impossible).

Blaisorblade commented 6 years ago

@JanBessai Looping types are still errors, but they're detected during compilation dynamically rather than statically.

Unfortunately, subtyping in Scala and Dotty is also undecidable for a bunch of other reasons. See #4385 and #318. That goes back to DOT — type members can be defined using open mutual recursion, just like term members, and nobody knows how to get strong normalization there too (the published state of the art is https://www.cs.purdue.edu/homes/rompf/papers/wang-ecoop17.pdf, I'm actually exploring the topic).

From a more practical standpoint: What happens if implicit search encounters a subtype-checking loop? This would be extremely hard to debug.

4385 turns the resulting stackoverflow into a log of the subtype checks done by the compiler.

What if S or T is moduleBySomebodyElse.ComplicatedPathDependentType[OurComplicatedType]: Do we have to break the abstraction barrier and look inside of mouduleBySomebodyElse to figure out, why this fails? How is "somebody else" supposed to know and document "don't put OurComplicatedType in there"?

Typechecking is still compositional in the appropriate sense — it doesn't depend on the bodies, only on their types — so the definition of ComplicatedPathDependentType is not behind the abstraction barrier—it's part of the interface, just like it isn't with proper dependent types or ML modules. Unless moduleBySomebodyElse.ComplicatedPathDependentType is abstract, but then its definition doesn't affect typechecking the call site, only the bounds (which is again part of the interface).

Without the restriction to have decidable subtyping, intuitionistic theorem provers (such as Coq) don't let you assume excluded middle S <: T or not (S <: T), preventing it from ever being used in any proof about Scala.

That's incorrect both because you can postulate excluded middle in Coq and because existing proofs of type soundness don't require "S <: T or not (S <: T)" — that's exactly decidability of subtyping, but is not required for proofs of soundness.

For the POPLmark challenge, people proved soundness of full F<:, which is undecidable https://www.seas.upenn.edu/~plclub/poplmark/poplmark.pdf. They did it in Coq as well (https://www.seas.upenn.edu/~plclub/poplmark/).
We have proofs of soundness for DOT, which is also undecidable because it can encode full F<: (see https://infoscience.epfl.ch/record/223518 Sec. 4.2, search "decidable").

JanBessai commented 6 years ago

Looping types are still errors, but they're detected during compilation dynamically rather than statically.

The question remains: will S & T <: T and T & S <: T both loop or none of them, or just one of them?

ComplicatedPathDependentType is not behind the abstraction barrier

That maybe a contentious issue in the presence of dependent types and macros, which both compute "behind the scene". Nothing enforces opaqueness (currently). However, libraries like scala-graph and shapeless are getting to the point where the details behind a type definition are not necessarily meant for public consumption.

That's incorrect both because you can postulate excluded middle in Coq and because existing proofs of type soundness don't require "S <: T or not (S <: T)" — that's exactly decidability of subtyping, but is not required for proofs of soundness.

That is not what I claimed. I've only claimed that no proof can ever use it in an intuitionistic setting, which remains true:

Adding an axiom turns Prop (or even worse Type) into classical logic. If you need strong decidability (in Type) anywhere this makes extraction impossible.
Soundness is only one very limited example of things you can proof about a type system. Especially with intersection types, path/organization/arrow/beta lemmas [1 Lemma 2.4, 2 Lemma 10, 3] break if you cannot compare atoms/type constructors of the specified types. Without these most other properties won't work. In current Scala formalizations this is not an issue, because they use a finite set of type constants such as TBool that come with decidable equality (hidden behind Coq's "inversion" tactic)

[1]: Barendregt, Henk, Mario Coppo, and Mariangiola Dezani-Ciancaglini. "A filter lambda model and the completeness of type assignment." The journal of symbolic logic 48.4 (1983): 931-940. [2]: Laurent, Olivier. "Intersection Subtyping with Constructors". ITRS 18 [3]: Bessai, Jan, et al. "Extracting a formally verified Subtyping Algorithm for Intersection Types from Ideals and Filters." Talk at COST Types, slides code (2016).

Blaisorblade commented 6 years ago

First, fixing the OP issue does not make typechecking decidable, and this seems indeed an unlikely way to trigger the problem.

Second, since we can as well finish the side discussion we started here rather than opening a new issue: Overall, Scala users suffer much bigger issues than decidability of typechecking — so worrying about this is a bit like optimizing something that is not the bottleneck. Then, this issue seems a non-bottleneck in a non-bottleneck.

I'm not against decidable typechecking, but that seems hopeless without strong normalizations of types, which is already research for DOT even tho there are no F-Bounds there. If you're interested, we could talk about this elsewhere.

That is not what I claimed. I've only claimed that no proof can ever use it in an intuitionistic setting, which remains true:

Ah, I misparsed "preventing it from ever being used in any proof about Scala", my bad. Still, this is likely to not be the 'bottleneck' for applying many of those theoretical projects to actual Scala. Most properties beyond soundness you might care about are likely false, so you'd need to change the language — for instance, MixML / 1ML / MLSub explore ML with a few ideas from Scala. I keep joking a fully principled Scala-like type system might be 50 years away :-)

Relative to "the theorem is false", having to use the Delay monad (or similar mechanisms) to model a semi-decidable typechecker, though painful, is a smaller problem.

That maybe a contentious issue in the presence of dependent types and macros, which both compute "behind the scene". Nothing enforces opaqueness (currently). However, libraries like scala-graph and shapeless are getting to the point where the details behind a type definition are not necessarily meant for public consumption.

And instead of whitebox macros (which have no meaningful type signature), Dotty is adding typelevel programs to use in their type signatures (in #4616) — that will help adding an abstraction barrier there.

For shapeless types, I see your concern, but I don't see a real abstraction barrier other than types.

The question remains: will S & T <: T and T & S <: T both loop or none of them, or just one of them?

That's implementation-defined — ideally all of those would loop, in practice some won't be detected. I agree this is annoying and unreliable in principle, I haven't seen complaints on this from actual users (including me), as opposed to PL researchers (a category which also includes me), mostly because they don't actually write such looping types, and certainly not in this way.

Blaisorblade commented 6 years ago

Conclusion: for now typechecking is undecidable anyway, and issues are much easier to trigger. This check is also hard to implement. So it seems reasonable to let this unimplemented; if other sources of undecidability are ever fixed, we might want to reconsider.