scala / scala3

The Scala 3 compiler, also known as Dotty.
https://dotty.epfl.ch
Apache License 2.0
5.79k stars 1.04k forks source link

No warning on uncheckable isInstanceOf call #11834

Open abgruszecki opened 3 years ago

abgruszecki commented 3 years ago

Minimized code

import scala.Option
object Test extends App {
  trait A[+X]
  class B[+X](val x: X) extends A[X]
  object B {
    def unapply[X](b: B[X]): Option[X] = Some(b.x)
  }

  class C[+X](x: Any) extends B[Any](x) with A[X]
  def f(a: A[Int]): Int =
    if a.isInstanceOf[B[Int]] then
      a.asInstanceOf[B[Int]].x
    else
      0
  f(new C[Int]("foo"))
}

Output

Compiles without warning, explodes at runtime.

Expectation

A warning about the .isInstanceOf test should be emitted.

Adapted from https://github.com/lampepfl/dotty/blob/master/tests/neg/i3989c.scala#L1.

/cc @liufengyun it seems that additional tweaking of how GADTs are used for checking .isInstanceOf is necessary.

liufengyun commented 3 years ago

@abgruszecki Nice counter-example. I'm wondering whether the compiler should check that C[X] has two different instantiations of the parent: A[Any] and A[X].

liufengyun commented 3 years ago

I'm wondering whether the compiler should check that C[X] has two different instantiations of the parent: A[Any] and A[X].

I think one justification for this check is the locality of reasoning. If the code only involves A and B, we need to be able to reason about their types based on the definition of just A and B. New definitions should not change the relationship between A and B. Otherwise, the rules will be difficult to understand and compromise usability.

abgruszecki commented 3 years ago

I think we're perfectly capable of local reasoning - we just cannot assume that if a value is of types A[Int] and B[X] for some X, then it must be of the type B[Int]. If we translate the classes to structural types like in our short paper then this becomes clear:

type structA = { type A_X }
// A[X] translated to structA & { type A_X <: X }
type structB = { type B_X; type A_X <: this.B_X }
// B[X] translates analogously to A[X]
// note that forall X, (translated B[X]) <: (translated A[X])
liufengyun commented 3 years ago

From the perspective of soundness, there can be many designs that make sense. However, IMHO, the design with the check is more intuitive and usable than others in practice.

liufengyun commented 3 years ago

we just cannot assume that if a value is of types A[Int] and B[X] for some X, then it must be of the type B[Int].

Here it depends on how we read the contract below:

trait A[+X]
class B[+X](val x: X) extends A[X]

If we think from programmer's perspective, we may think the encoding captures only part of the contract. A possible enhancement of the contract is the following:

abgruszecki commented 3 years ago

If you're suggesting to forbid definitions like C then I'm not sure if it can be easily done. If we take a look at the original issue https://github.com/lampepfl/dotty/issues/3989, then the last time this was discussed it was concluded that the loss of expressivity from forbidding those definitions was untenable, in particular because standard library depended on being able to create such definitions.

LPTK commented 3 years ago

Here it depends on how we read the contract below:

trait A[+X]
class B[+X](val x: X) extends A[X]

It's easy to misread this when the types are named the same. The issue becomes more apparent if you use different names:

trait A[+X]
class B[+Y](val x: Y) extends A[Y]

Now we can see there is no reason that B's Y should be the same as A's X.

A possible solution I've always found attractive would be to bring back type class parameters, which are analogous to val class parameters in that they make the value/type public and visible from other classes.

trait A[type +X]
class B[type +X](val x: X) extends A[X]

Above, A now has a type member X which represents the "true" underlying type argument, so that for all a: A[T], we have a.X <: T. The same goes for B, so both must now denote the same type.

dwijnand commented 2 years ago

bring back type class parameters, which are analogous to val class parameters in that they make the value/type public and visible from other classes.

Would that give all type members the ability to declare their variance, or only type class parameters?

If we think from programmer's perspective, we may think the encoding captures only part of the contract. A possible enhancement of the contract is the following:

  • forall X, B[X] <: A[X]
  • forall x, X, x: A[X] and x: B[?] , then x: B[X].

I assume that today the user could (or should) be able to enforce that X doesn't get multiple instantiations by making B final or, harder, make A sealed (and reason about the known children).

Also, are those two enhancements different? I couldn't think of an example that satisfies one and not the other...

abgruszecki commented 2 years ago

bring back type class parameters, which are analogous to val class parameters in that they make the value/type public and visible from other classes.

Would that give all type members the ability to declare their variance, or only type class parameters?

IIRC, variance of type here affects meaning of applying the appropriate parameter, I'm not sure if variance annotation on normal type parameters makes sense.

If we think from programmer's perspective, we may think the encoding captures only part of the contract. A possible enhancement of the contract is the following:

  • forall X, B[X] <: A[X]
  • forall x, X, x: A[X] and x: B[?] , then x: B[X].

I assume that today the user could (or should) be able to enforce that X doesn't get multiple instantiations by making B final or, harder, make A sealed (and reason about the known children).

Also, are those two enhancements different? I couldn't think of an example that satisfies one and not the other...

You mean Fengyun's and Lionel's suggestion? Lionel is arguing for something similar to what Fengyun is suggesting, but with the ability to opt into the behaviour. There's also the matter that Lionel is proposing to make type parameters with type actual type members. So there's some differences between the two.

LPTK commented 2 years ago

@dwijnand

Would that give all type members the ability to declare their variance, or only type class parameters?

No. In my proposal, just like in current Scala, a variance annotation affects the variance of the type constructor, that's it. Type members have no business being variant as far as I can tell. If you disagree, would you be able to describe a reasonable semantics for variant members, as well as interesting use cases for them?

dwijnand commented 2 years ago

You mean Fengyun's and Lionel's suggestion?

No (and it's more of an aside) I meant between Fengyun's two "forall" examples - I thought they must intended different things but I couldn't think of a differing example.

Type members have no business being variant as far as I can tell. If you disagree, would you be able to describe a reasonable semantics for variant members, as well as interesting use cases for them?

I don't know. 😄 There was a desire to unify type parameters with type members, with type parameters desugaring to type members, but that leaves nowhere for variance to be defined, right? And then that project in Dotty was aborted and I haven't found out why. With DOT only having type member, I guess we never model variance in any of the soundness proofs? Do any of the DOT extensions add variance?

LPTK commented 2 years ago

Yeah you are right that at some point, IIRC, Dotty had a concept of type parameters desugaring into type members which had variance associated to them. But I think that was a conceptual mistake, or at the very least a conflation of unrelated concepts.

With DOT only having type member, I guess we never model variance in any of the soundness proofs? Do any of the DOT extensions add variance?

I think declaration-site variance of type constructors is not essential, but more of a derived concept. It can be even defined without reference to an "intrinsic" concept of type parameters. In DOT, we'd consider a definition class C[+A] to simply be defining a type C <: { type A } with type member A along with a notationC[T] which desugars to C & { type A <: T}. Notice that the type member A has no concept of variance, and that other references to C such as C & { type A = T} and even C & { type A >: T} still make sense. So for simple cases like this, declaration-site variance is just notational convenience. (It becomes more tricky when higher-kinded types and high-order subtyping is involved, though.)

dwijnand commented 2 years ago

Huh, thanks! I guess that also does away with having to model all the variance checks that class type-checking does, making sure variance is honoured in type parameter usage in its methods.

LPTK commented 2 years ago

Sure, pushing this approach to its logical conclusion, we end up basically treating variance as a thin layer of syntax sugar over Java-style use-site variance (where there are no restrictions on type parameter use). It's helpful to boil things down this way to understand the relation of variance to the underlying theory. But proper declaration-site variance is still useful, from a user point of view 🙂

odersky commented 2 years ago

I wonder whether we can instead flag this line as an error.

  class C[+X](x: Any) extends B[Any](x) with A[X]

The principle would be that we don't allow variance reversal where we have parents A[X], B[Y], with B a subclass of A but X a subtype of Y.

Not sure about the precise rule, but it seems that it's a lot more reasonable to continue to support reasoning about the pattern matches than to continue supporting such inheritance patterns.

dwijnand commented 2 years ago

That's mentioned in:

If you're suggesting to forbid definitions like C then I'm not sure if it can be easily done. If we take a look at the original issue #3989, then the last time this was discussed it was concluded that the loss of expressivity from forbidding those definitions was untenable, in particular because standard library depended on being able to create such definitions.

Which I think refers back to your comment in https://github.com/scala/bug/issues/7093#issuecomment-292418481, which I think might refer to <2.13 collections? So perhaps we can do this now, which would make this simpler to think about.

smarter commented 2 years ago

I wonder whether we can instead flag this line as an error.

Note that we already emit an error if B is a case class:

trait A[+X]
case class B[X](x: X) extends A[X]
class C[X](x: Any) extends B[Any](x) with A[X]
//    ^
//illegal inheritance: class C inherits conflicting instances of non-variant base trait A.
//
//  Direct basetype: A[X]
//  Basetype via case class B: A[Any]

This is implemented in https://github.com/lampepfl/dotty/blob/30ce9a5256494d60ae7249a004a18e03e433faf1/compiler/src/dotty/tools/dotc/typer/RefChecks.scala#L776-L785 A similar check is https://github.com/lampepfl/dotty/blob/30ce9a5256494d60ae7249a004a18e03e433faf1/compiler/src/dotty/tools/dotc/typer/RefChecks.scala#L798-L814

These were both implemented in https://github.com/lampepfl/dotty/pull/4013

abgruszecki commented 2 years ago

This is something that came up when we were thinking on how to encode class type parameters as type members for GADT purposes. It feels like there were cases of code that actually depended on being able to define classes like C here:

  class C[+X](x: Any) extends B[Any](x) with A[X]

But I cannot recall any specific examples. From the perspective of type members, the way I understand A, B, is:

type A[+X] <: { type X_A <: X }
type B[X] <: A[X] & { type X_B = X }
type C[X] <: B[Any] & A[X]

Take a careful look at how B behaves -- it just extends A[X], but doesn't really put any requirements on X_A. I.e. for any B[X], we know it's also an A[X], but nothing more. By contrast, the way I understand the case class version of B is:

type AltB[X] <: A[X] & { type X_B = X ; type X_A = X_B }
type AltC[X] <: A[X] & AltB[Any]

This alternative version of B now forces a specific relationship between X_B and X_A. Thanks to that, AltC is malformed -- it cannot possibly be inhabited. The way AltB is defined tells us that if we have a value of type AltB[X] & A[Y], then actually X and Y are the same. EDIT: no, that's not true, actually all we know is that X <: Y.

So I don't know if this is an argument for doing things one way or another, but at least in DOT, we have a way of expressing both versions of B.

dwijnand commented 2 years ago

I've found an example where the current implementation of variantInheritanceProblems seems to be too strict (after making it apply to non-case classes). The example I have come from the strawman collections, which I've minimised to:

trait A[+X]
trait B extends A[B]
class C extends B with A[C]

In the original this is Iterable, Seq and List, with List refining that it not only rebuilds from Seqs but from Lists specifically. I'm struggling with the implementation to find a way to make it allow Seq -> List, but disallow X -> Any & X. Perhaps it's just not doable with baseType and I need to look to do something like paramInfoAsSeenFrom? Any tips or advice welcome.

odersky commented 2 years ago

Perhaps it's just not doable with baseType and I need to look to do something like paramInfoAsSeenFrom? Any tips or advice welcome.

I think we are all still scratching our heads way too much to be able to give advice on specifics like that. 🤯

LPTK commented 2 years ago

I'm struggling with the implementation to find a way to make it allow Seq -> List, but disallow X -> Any & X.

Why do you want to do that? FWIW, the type Any & X is equivalent to X, so no principled solution would likely work here, and it's probably not a good idea.

dwijnand commented 2 years ago

Thanks. I think I found what I'm looking for: for a covariant type parameter I want to allow new instantiations that are subtypes, so from <: Seq to <: List, but not from <: T to <: String. I mean I found the thing to look for, now I just need to figure out how to find it, in the API 🙂

 trait A[+X]
 class B[+Y](val value: Y) extends A[Y]
 class C[+Z](str: String)  extends B[String](str) with A[Z]

 // A[T] = A { type X <: T }
 // B[T] = B { type X <: Y; type Y <: T }
 // C[T] = C { type X <: (Y & Z); type Y <: String; type Z <: T }
 //
 // C[T] ~> B[String] ~> A[String] ~> String // via middle=B
 // C[T] ~> A[T] ~> T                        // via base=A
 // T <: String = false

 trait Iter[+Coll]
 trait Seq  extends Iter[Seq]
 class List extends Seq with Iter[List]

 // Iter[T] = Iter { type Coll <: T   }
 // Seq     = Seq  { type Coll <: Seq }
 // List    = List { type Coll <: (Seq & List) }
 //
 // List ~> Seq ~> Iter[Seq] ~> Seq // via middle=Seq
 // List ~> Iter[List] ~> List      // via base=Iter
 // List <: Seq = true

Edit: Nevermind... it's not B[String] ~> A[String] it's B[String] ~> A[String & T] which doesn't work..

dwijnand commented 2 years ago

@abgruszecki discovered that Kotlin also requires "invariant refinements", in the following gist covariant type parameter T ("out" means covariant) is refined from Any to Int, and that is a compilation error:

https://play.kotlinlang.org/#eyJ2ZXJzaW9uIjoiMS42LjIxIiwicGxhdGZvcm0iOiJqYXZhIiwiYXJncyI6IiIsIm5vbmVNYXJrZXJzIjp0cnVlLCJ0aGVtZSI6ImlkZWEiLCJjb2RlIjoiLyoqXG4gKiBZb3UgY2FuIGVkaXQsIHJ1biwgYW5kIHNoYXJlIHRoaXMgY29kZS5cbiAqIHBsYXkua290bGlubGFuZy5vcmdcbiAqL1xuaW50ZXJmYWNlIEJhc2UgPG91dCBUPjtcbmludGVyZmFjZSBCYXNlQW55IDogQmFzZTxBbnk+O1xuaW50ZXJmYWNlIEJhc2VJbnQgOiBCYXNlPEludD47XG5vcGVuIGNsYXNzIERlcml2ZWQxKCkgOiBCYXNlIDxBbnk+O1xuY2xhc3MgRGVyaXZlZDIoKSA6IERlcml2ZWQxKCkgLCBCYXNlSW50OyJ9

LPTK commented 2 years ago

Yes you are right! It's something we've actually just noticed and mentioned in the introductory material of our latest paper (in submission – let me know if you want me to send you a preprint).

dwijnand commented 2 years ago

Gladly, thank you.