pattern matching over covariant GADT unsound

scabug commented 10 years ago

The following code produces a ClassCastException at runtime:

object Test extends App {
  sealed trait Node[+A]
  case class L[C,D](f: C => D) extends Node[C => D]

  def test[A,B](n: Node[A => B]): A => B = n match {
    case l: L[c,d] => l.f
  }

  println {
    test(new L[Int,Int](identity) with Node[Nothing]: Node[Int => String])(3): String
  }
}

scabug commented 10 years ago

Imported From: https://issues.scala-lang.org/browse/SI-8563?orig=1 Reporter: Owen Healy (ellbur) Affected Versions: 2.11.0 See #7714

scabug commented 10 years ago

@retronym said: Thanks for the report.

I'm think this is the same issue as reported in #7714, pattern matching is unsound if the presence of variant GADTs.

I haven't dug into this to confirm if it is the same bug, so I'll leave this open with a link for now.

scabug commented 10 years ago

@retronym said: I'll also note that unlike #7886, this bug is also exhibited without case classes (ie with extractors)

object Test {
  sealed trait Node[+A]

  // case class L[C,D](f: C => D) extends Node[C => D]

  class L[C, D](val f: C => D) extends Node[C => D]
  object L {
    def unapply[C, D](x$0: Test.L[C,D]): Option[C => D] = ???
  }

  def test[A,B](n: Node[A => B]): A => B = n match {
    case l: L[c,d] => l.f
  }

  def main(args: Array[String]) {
    println {
      test(new L[Int,Int](identity) with Node[Nothing]: Node[Int => String])(3): String
    }
  }
}

scabug commented 10 years ago

@retronym said: [~blaisorblade] documented the difficulties with GADT patterm matching here: http://lampwww.epfl.ch/~hmiller/scala2013/resources/pdfs/paper5.pdf

scabug commented 10 years ago

Owen Healy (ellbur) said: The fact that this can be implemented without case class matching but only when allowing ??? is interesting. If you believe (as I do) that pattern matching should ultimately be syntactic sugar for function application, then the following should be equivalent:

object Test4 extends App {
  sealed trait Node[+A] {
    def extract[R](taker: A => R): R
  }
  class L[C,D](f: C => D) extends Node[C => D] {
    def extract[R](taker: (C => D) => R) = taker(f)
  }

  def test[A,B](n: Node[A => B]): A => B = n.extract(f => f)

  println {
    test(new L[Int,Int](identity) with Node[Nothing]: Node[Int => String])(3): String
  }
}

But, this produces a compile error for the lack of an appropriate extract method for the anonymous subtype of Node[Nothing].

scabug commented 10 years ago

@Blaisorblade said:

On the original example here

All examples in #7714 are unrelated, other than by "GADT type refinement in patterns is broken".

The first example here has exactly the same unsoundness I found (described not only in the paper, but also mentioned in my comment to SI-6944) — the type refinement inside test is not valid, because new L[Int, Int](identity) with Node[Nothing] is legal (you can allow either this instantiation or test, not both together).

The original author of the pattern matcher (Burak Emir) wrote papers on the topic, but the languages he describes always forbid new L[Int, Int](identity) with Node[Nothing]: since L[Int, Int] already inherits from Node, you cannot inherit from Node again with a more specific type argument. I refer not mainly to "Matching Objects with Patterns" (ECOOP 2007), which ignores covariance, but especially to "Variance and Generalized Constraints for C# Generics" (ECOOP 2006), where he requires "that generic instantiations are uniquely determined". Even though that paper is for C#, that's the closest thing to a formal study of soundness for this kind of constructs.

scabug commented 10 years ago

@Blaisorblade said (edited on May 5, 2014 8:07:15 PM UTC):

On Jason Zaugg's variant

I'm slightly confused on the need for an extractor. On Scala 2.10.3 and 2.11.0, you don't need any extractor to reproduce the bug — a typed match does not use it. (The use of a typed extractor is a slight difference with the example I mentioned before in #6944, but it appears immaterial). To wit:

object Test {
  sealed trait Node[+A]

  // case class L[C,D](f: C => D) extends Node[C => D]

  class L[C, D](val f: C => D) extends Node[C => D]

  def test[A,B](n: Node[A => B]): A => B = n match {
    case l: L[c,d] => l.f
  }

  def main(args: Array[String]) {
    println {
      test(new L[Int,Int](identity) with Node[Nothing]: Node[Int => String])(3): String
    }
  }
}

[~ellbur], I think your example is interesting. Technically speaking, that's not at all the desugaring used: a typed match, that is e match {case v: SomeType => ...};, must be desugared (conceptually) using if (e.isInstanceOf[SomeType]) {val v = e.asInstanceOf[SomeType]; ...};.

However, it's true that pattern matching on sum types can usually be understood using elimination forms (your extract method), even for GADTs. (But your translation is equivalent to requiring that Node has a member of type A). In fact, desugaring to elimination forms can be an implementation strategy.

For Scala, I'd be happier with explicit witnesses of type equality, like in GHC's core language. GADT translation just inserts a checkcast after typechecking, but this typechecking produces no certificate that the type is right, unlike GHC does.

IMHO, if GHC hackers feel the need to use such high-assurance technology, one cannot hope that Scalac magically gets this right. But getting this right will take no less than one PhD student, and right now I'm not sure there are the technical foundations to get this started.

scabug commented 10 years ago

Owen Healy (ellbur) said:

But your translation is equivalent to requiring that Node has a member of type A

Yes, very true. In fact, that's how I would intuitively read this GADT definition in the presence of sealed -- A Node[A] must be an L[C,D] for some (C => D) =:= A, and an L[C,D] has a member of type C => D aka A.

To me, it is more surprising that that intuitive reading turns out not to hold.

scabug commented 9 years ago

Arnold deVos (a4dev) said: A smaller example that does not involve covariance but does produce a ClassCastException. This is scala 2.11.5:-

object Test {
  trait B
  case class C[X, Y]( x: X, y: Y, f: X => Y) extends B
  def eval(b: B) = b match { case C(x, y, f) => f(y) }
  eval(C("abc", 0, (x: String) => x.length))
}
Test

java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.String
  at Test$$anonfun$1.apply(<console>:13)
  at Test$.eval(<console>:11)
  ... 45 elided

scabug commented 9 years ago

@retronym said (edited on Apr 2, 2015 3:19:04 AM UTC): The variance here is in the type parameters of Function1.

scabug commented 9 years ago

Arnold deVos (a4dev) said:

The variance here is in the type parameters of Function1.

I see.

aaronlevin commented 6 years ago

(sorry, wrong issue)

Blaisorblade commented 6 years ago

Fix unlikely in Scala 2 but see lampepfl/dotty#3989 and lampepfl/dotty#4013 for the Dotty plans.

Blaisorblade commented 6 years ago

This might warrant fixing if anybody's interested — the main issue was deciding how to fix the bug. Backporting the fix to Scalac would help figuring out if there are major (and valid) uses for the old behavior.

dwijnand commented 3 years ago

3.0.0:

scala> object Test extends App {
     |   sealed trait Node[+A]
     |   case class L[C,D](f: C => D) extends Node[C => D]
     |
     |   def test[A,B](n: Node[A => B]): A => B = n match {
     |     case l: L[c,d] => l.f
     |   }
     |
     |   println {
     |     test(new L[Int,Int](identity) with Node[Nothing]: Node[Int => String])(3): String
     |   }
     | }
     |
10 |    test(new L[Int,Int](identity) with Node[Nothing]: Node[Int => String])(3): String
   |         ^
   |illegal inheritance: anonymous class Test.L[Int, Int] with Test.Node[Nothing] {...} inherits conflicting instances of non-variant base trait Node.
   |
   |  Direct basetype: Test.Node[Nothing]
   |  Basetype via case class L: Test.Node[Int => Int]

scala / bug

pattern matching over covariant GADT unsound #8563

On the original example here

On Jason Zaugg's variant