scala / scala3

The Scala 3 compiler, also known as Dotty.
https://dotty.epfl.ch
Apache License 2.0
5.86k stars 1.06k forks source link

Unsound implementation of singleton type matching #9359

Open LPTK opened 4 years ago

LPTK commented 4 years ago

Minimized code

case class Key[T]()(val value: T)

def foo[K[_], A, B](k0: K[A], k1: K[B]): Option[k1.type] = k0 match {
  case k: k1.type => Some(k)
  case _ => None
}

@main def test = {
  val k0 = Key()(42)
  val k1 = Key()("oops")
  foo(k0, k1).foreach(_.value.length)
}

https://scastie.scala-lang.org/2LlP0dwdT7qqFiTTGUSszA

Output

java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.String
    at main$package$.test$$anonfun$1(main.scala:12)
    at scala.Option.foreach(Option.scala:437)
    at main$package$.test(main.scala:12)
    at test.main(main.scala:9)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at sbt.Run.invokeMain(Run.scala:115)
    at sbt.Run.execute$1(Run.scala:79)
    at sbt.Run.$anonfun$runWithLoader$4(Run.scala:92)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at sbt.util.InterfaceUtil$$anon$1.get(InterfaceUtil.scala:10)
    at sbt.TrapExit$App.run(TrapExit.scala:257)
    at java.lang.Thread.run(Thread.java:748)

Expectation

Type error.

odersky commented 4 years ago

Any suggestion how to solve this?

smarter commented 4 years ago

case k: k1.type => Some(k)

Looks like this is compiled to == k1 when it should be compiled to eq k1 (by contrast k0.isInstanceOf[k1] is correctly compiled to k0 eq k1)

LPTK commented 4 years ago

Yes, the compiler generates the wrong check for this code. I don't have time to look into it ATM, but I could give it a go in about a week.

bishabosha commented 4 years ago

K[_] is not a subtype of AnyRef, so there is no eq method, unless we are meant to use the assumption of erased generics? https://github.com/lampepfl/dotty/blob/5cdfd31ef55c6e5e93f938d64bde503bae95975f/compiler/src/dotty/tools/dotc/ast/tpd.scala#L948

smarter commented 4 years ago

Interesting, I hadn't seen that code. I think that logic is erroneous: we can only fallback on == if the type of the value is a subtype of AnyVal, otherwise I guess we need to emit an error that this singleton type test is illegal.

smarter commented 4 years ago

(and indeed, the testcase can be rewritten to def foo[K[_] <: AnyRef, ...] to behave appropriately)

bishabosha commented 4 years ago

actually, Scala 2 has the same behaviour (calls BoxesRunTime.equals) and warns about erasure:

example2.scala:5: warning: abstract type pattern K[B] (the underlying of k1.type) is unchecked since it is eliminated by erasure
    case k: k1.type => Some(k)
              ^
1 warning
smarter commented 4 years ago

Ah indeed, this is related to https://github.com/lampepfl/dotty/issues/8430 where we concluded that we could always emit such warnings like Scala 2 does, so I guess that would technically solve this issue too (but I'm not supper happy with the fact that adding/removing a <: AnyRef bound can silently changes semantics like this).

bishabosha commented 4 years ago

this seems to work fine:

if (tp.widen.derivesFrom(defn.AnyValClass))
  singleton(tp).equal(tree)
else
  tree.ensureConforms(defn.ObjectType).select(defn.Object_eq).appliedTo(singleton(tp).ensureConforms(defn.ObjectType))

and the generated code seems fine:

public <K, A, B> Option<Object> foo(Object k0, Object k1) {
    None$ none$;
    Object object = k0;
    if (object == k1) {
        Object k = object;
        none$ = Some$.MODULE$.apply(k);
    } else {
        none$ = None$.MODULE$;
    }
    return none$;
}
smarter commented 4 years ago

We can't do that, because if you have a primitive (or value class) upcasted to Any, it should not be compared using eq, because the same primitive might be boxed to two different objects.

LPTK commented 4 years ago

Isn't there a way to test, at runtime, whether two Any values have the same identity? For references it would use eq, and for boxed AnyVal values it would use ==.

smarter commented 4 years ago

For primitives, you could hardcode a list, but for non-primitive value classes I don't see a way. As always the answer is: valhalla will solve everything :).

LPTK commented 4 years ago

Note that AnyVal classes which happen to be equal (by equals) should not match with .type patterns, as that would be unsound (they could have differing type arguments/type members). So the current behavior is probably wrong:

scala> class Test(val x: Int) extends AnyVal
// defined class Test

scala> val t = new Test(1)
val t: Test = 1

scala> val u = new Test(1)
val u: Test = 1

scala> t match { case x: u.type => true }
val res0: Boolean = true

Now, AnyVal instances do not have a stable identity, since they create new boxes upon being up-casted to Any. So I think .type tests should simply never return true for AnyVal classes, should they be boxed or not.

smarter commented 4 years ago

So I think .type tests should simply never return true for AnyVal classes, should they be boxed or not.

Not sure I follow, you can't know at runtime whether something is a boxed AnyVal or not, and you can't prevent x eq x from returning true.

smarter commented 4 years ago

Note that AnyVal classes which happen to be equal (by equals) should not match with .type patterns, as that would be unsound (they could have differing type arguments/type members).

For a value class with type parameters you'd need unchecked warnings yeah.

LPTK commented 4 years ago

It would be helpful if AnyVal classes extended some common base class internally. Otherwise, having _: x.type sometimes match x and sometimes not (depending on whether they are boxed separately) for AnyVal class instances seems like the lesser of two evils — the other being unsoundness in the type system.

LPTK commented 4 years ago

For a value class with type parameters you'd need unchecked warnings yeah.

You would not necessarily even see that the base thing has type parameters!

def test(x: Any, y: Any) = x match { case y: x.type => }

(Now, maybe the above could get an unchecked warning. Currently it doesn't raise anything.)

smarter commented 4 years ago

The design of value classes is frozen until valhalla arrives since that will change everything (and it's not like we can change it now with the Scala 2 compat story anyway). We just need unchecked warnings as already decided in https://github.com/lampepfl/dotty/issues/8430.

LPTK commented 4 years ago

To be clear, you're proposing to basically remove type system support for singleton type patterns? (i.e., one can always use @unchecked to remove the warning, but then soundness goes out the window and the compiler stops helping you). Seems like a pity as it has interesting applications (see for instance the end of this message).

smarter commented 4 years ago

I'm not proposing anything but that was the consensus yeah.

abgruszecki commented 4 years ago

So if I understand the issue correctly, it seems that with a pattern like x.type where we don't know that x.type <: AnyRef, we need to choose between:

1) A value which is not an instance of x sometimes matching the pattern (b/c we compare with equals) 2) A value which is an instance of x sometimes not matching the pattern (b/c we compare with eq)

@smarter do I understand correctly that we're effectively forced to pick 1) because of previous value class design? I'm not entirely certain if that choice sits with me well.

Reg. #8430 - I was hoping that we could just emit a warning along the lines of "singleton type patterns are unsound with the common instance reuse trick, add annotation X to assert that you know what you're doing". Given this issue, that may not be enough though.

neko-kai commented 4 years ago

Are opaque types immune from the same issue with value classes? They can have different type parameters while sharing the exact same identity, just as AnyVals do.

LPTK commented 4 years ago

@neko-kai opaque types don't have this problem because they don't box things implicitly — which is what messes with object identity semantics, leading to the paradoxes we're grappling with here.

That said, if because of value classes we decided to keep following the crazy scheme that equals be used to match .type patterns when we don't know that the scrutinee is AnyRef, then we'd make the entire type system is unsound — not limited to value classes.

abgruszecki commented 4 years ago

@LPTK opaque types arguably have an even worse problem precisely because they don't box. Consider:

object O {
  opaque type F[T] = Int
  def mkF[T](i: Int) = i
}

Here neither eq nor equals helps with determining whether two instances of F[T] are in fact the same instance. With value classes at the very least eq is true only if we have the same box instance.

I also wouldn't be so quick to call the value class scheme "crazy" - I meant it when I said that I'm not sure what to prefer between letting wrong values match the pattern and letting good values not match the pattern. Arguably both are unsound, the second will just fail early (b/c of unmatched value error).

LPTK commented 4 years ago

@AleksanderBG

opaque types arguably have an even worse problem precisely because they don't box

I don't think that poses any problem. What makes you think that?

Here neither eq nor equals helps with determining whether two instances of F[T] are in fact the same instance.

At least in principle, we can tell whether two Int instances are the same or not, regardless of whether they're boxed or not (just use equals on a hardcoded list of primitives, as mentioned by @smarter). I don't think opaque types are supposed to hide the fact that instances be the same or not. That's not a capability that appears anywhere in their contract. In fact, their contract is precisely that the underlying values behave like values of the underlying types at runtime (opaque types are only a type system abstraction). This should be sound, because the type parameters of opaque types shouldn't participate in GADT reasoning (they don't, right?).

I also wouldn't be so quick to call the value class scheme "crazy"

What's crazy is that this scheme is not restricted to value classes! The current scheme will call any user-defined equals to determine whether two instances are the same, which is nonsensical:

scala> def foo[A](x: A, y: A) = x match { case _: y.type => 0 }
def foo[A](x: A, y: A): Unit

scala> class T { override def equals(that: Any) = { println("hello!"); true } }
// defined class T

scala> foo(new T, new T)
hello!
val res0: Int = 0
abgruszecki commented 4 years ago

Why wouldn't type parameters of opaque types participate in GADT reasoning? They are no different from any other "phantom" type parameter. I think the precise thing that we have going here is that singleton type patterns which are subtypes of opaque types are simply unsound, and so they can result in unsound GADT constraints.

neko-kai commented 4 years ago

If type parameters of both opaques and AnyVals wouldn't participate in GADT reasoning, would that avoid the issue?

LPTK commented 4 years ago

Why wouldn't type parameters of opaque types participate in GADT reasoning? They are no different from any other "phantom" type parameter.

@AleksanderBG I'm surprised that you say that. Maybe we're misunderstanding each other.

Opaque type parameters are very different from class type parameters (phantom or not). Consider:

class Test {
  type F[T]
}
val test: Test = ...

You obviously cannot treat T as you would a class type parameter in GADT reasoning — that is, for instance, you cannot say that if x: test.F[A] and x: test.F[B] then A =:= B. This is because for example if test.F[T] =:= Any this obviously doesn't hold.

Now, from the type system's point of view, opaque types should behave like abstract type members. (The major difference is that opaque types are erased to their underlying types, whereas abstract type members are erased to Object or their upper bound if any — but this difference only affects the post-erasure type system.)

abgruszecki commented 4 years ago

@LPTK ah right, I see what you're objecting to. Yes, you're correct, opaque type constructors aren't injective, which makes my point moot. I mostly focused on the "not participate in GADT reasoning" bit, because to me that suggests that opaque type constructors are some sort of special case for GADT reasoning, which I don't think they are.

This actually brings us to the point made by @neko-kai - perhaps we should treat value classes as not injective? This'd solve the issue (at least it'd get rid of the unsoundness), and the implementation should be trivial. However, unless I'm thinking about this wrong, I don't think this is the correct thing to do - if VC[T] is a value classes, then we cannot write code that converts VC[A] into VC[B] for arbitrary A, B without using casts, which would mean that VC really is injective. (For contrast, for opaque types we can do the above, since they are simply "local" type aliases.)

LPTK commented 4 years ago

perhaps we should treat value classes as not injective? This'd solve the issue (at least it'd get rid of the unsoundness)

You'd still have to do something about type members, at least. Would that even be sufficient? Who knows? That doesn't seem like a very good solution.

if VC[T] is a value classes, then we cannot write code that converts VC[A] into VC[B] for arbitrary A, B without using casts

Currently we can, because of the bogus _: x.type patterns implementation.

So I think we should disallow _: x.type patterns on types we don't statically know to be subtypes of AnyRef. That's because there's no good implementation for them otherwise, notably because of value classes (and even without value classes, their implementation would be messy, having to hard-code primitive type equality tests).

We can't even make _: x.type patterns always fail on non-AnyRef types either, because it seems there is no way to detect whether something is a boxed AnyVal or a "true" AnyRef value.

abgruszecki commented 4 years ago

@LPTK an opaque type can also be a subtype of AnyRef, no? If we allow singleton typecase patterns for opaque types then we still have unsoundness, because we get a value of type x.type which isn't an instance of x.

I agree that only permitting typecase patterns when we know we're sound is the principled solution. This'd be a compatibility break with Scala 2 though, which simply emitted a warning and compiled such patterns. I'm not certain if this is necessarily a good thing right now.

LPTK commented 4 years ago

@LPTK an opaque type can also be a subtype of AnyRef, no? If we allow singleton typecase patterns for opaque types then we still have unsoundness, because we get a value of type x.type which isn't an instance of x.

What? Look, the situation is really simple:

we get a value of type x.type which isn't an instance of x

First, x is not a type. What does "isn't an instance of x" mean? I imagine what you mean is that we can smuggle opaque type values out of the opaque type into the underlying type by using x.type pattern. Note that this does not represent a source of unsoundness AFAIK. Moreover, this issue really isn't specific to this form of pattern. You can do that with any pattern, really. It's the general issue described in https://github.com/lampepfl/dotty/issues/7314 — in summary, opaque types are a fundamentally leaky abstraction, due to Scala's pattern matching semantics (which can discover new type info at runtime, including type info which was hidden behind an opaque type).

abgruszecki commented 4 years ago

Right, I guess I keep making the mistake of thinking of opaque types like they were newtypes, whereas they really are just type members.

By "we get a value of type x.type which isn't an instance of x" I meant "we get a value of type x.type which isn't the same object instance as x". This kind of thinking doesn't really apply to opaque types though, since they don't add additional layer of object identity on top of the underlying type, just like type members.

bishabosha commented 4 years ago

would it be too intrusive to require some implicit evidence that implements singleton type testing that is only required when the type is upper bound Any?

abgruszecki commented 4 years ago

As a first guess, I'd expect so. Remember that this implicit evidence would need to be passed all the way from where we know the concrete type, which might be separated by multiple methods where the type is abstract (a type parameter).