scala / scala3

The Scala 3 compiler, also known as Dotty.
https://dotty.epfl.ch
Apache License 2.0
5.88k stars 1.06k forks source link

Scala Wart: Conflating total destructuring with partial pattern-matching #2578

Closed lihaoyi closed 5 years ago

lihaoyi commented 7 years ago

Opening this issue, as suggested by Martin, to provide a place to discuss the individual warts brought up in the blog post Warts of the Scala Programming Language and the possibility of mitigating/fixing them in Dotty (and perhaps later in Scala 2.x). These are based on Scala 2.x behavior, which I understand Dotty follows closely, apologies in advance if it has already been fixed


The following example, also from Lincon Atkinson's blog, compiles without warning and fails with an exception at runtime:

@ val (a, b, c) = if (true) "bar" else Some(10)
scala.MatchError: bar (of class java.lang.String)
  $sess.cmd105$.<init>(cmd105.sc:1)
  $sess.cmd105$.<clinit>(cmd105.sc:-1)

The basic problem here is that when Scala sees val (a, b, c) = ..., it doesn't mean which of two things you mean:

  1. Help me extract the values from ..., and help me check that it's a tuple
  2. Help me extract the values from ..., and fail at runtime if it is not a tuple

Currently, it assumes the latter, in all cases.

That makes any sort of "destructuring assignment" unchecked, and thus extremely unsafe.

The above example at least happily fails with an exception, but the following exhibits the same problem, but instead truncates your data silently, losing the 5:

@ for((a, b) <- Seq(1 -> 2, 3 -> 4, 5)) yield a + " " +  b
res107: Seq[String] = List("1 2", "3 4")

Though the following also fails with an exception:

@ Seq(1 -> 2, 3 -> 4, 5).map{case (a, b) => a + " " + b}
scala.MatchError: 5 (of class java.lang.Integer)
  $sess.cmd108$.$anonfun$res108$1(cmd108.sc:1)
  scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:234)
  scala.collection.immutable.List.foreach(List.scala:389)
  scala.collection.TraversableLike.map(TraversableLike.scala:234)
  scala.collection.TraversableLike.map$(TraversableLike.scala:227)
  scala.collection.immutable.List.map(List.scala:295)
  $sess.cmd108$.<init>(cmd108.sc:1)
  $sess.cmd108$.<clinit>(cmd108.sc:-1)

While interpretation #2 makes sense in match blocks and partial-functions, where you expect to "fall through" to the next handler if it doesn't match, it doesn't make much sense in cases like this where there is nowhere to fall through to.

The correct solution would look something like this:

A possible syntax might be using case, which Scala developers already associate with partial functions and pattern matches:

for(case (a, b) <- Seq(1 -> 2, 3 -> 4, 5)) yield a + " " +  b

case val (a, b, c) = if (true) "bar" else Some(10)

This would indicate that you want to perform an "partial fail at runtime" match, and the earlier non-case examples:

for((a, b) <- Seq(1 -> 2, 3 -> 4, 5)) yield a + " " +  b

val (a, b, c) = if (true) "bar" else Some(10)

Could then verify that the pattern match is complete, otherwise fail at compile time.

PostScript:

I noticed here that the Scala language spec already has words in it that talk about irrefutable patterns, which seem to match what I want these for-comprehension and val-destructuring cases to require. Whether those words mean anything, I do not actually know

Yawar Amin has noted that in the Rust language, the cases which closely mirror those in Scala (let destructuring, and while-loop/if destructuring) do have a requirement of the destructuring patterns being irrefutable https://doc.rust-lang.org/beta/book/second-edition/ch18-02-refutability.html

Here's a 2.12 failure mode in for-comprehensions that's nonsensical, and is fundamentally caused by this issue:

lihaoyi ~$ amm
Loading...
Welcome to the Ammonite Repl 0.9.7
(Scala 2.12.2 Java 1.8.0_112)
If you like Ammonite, please support our development at www.patreon.com/lihaoyi
@ for((a, b) <- Right((1, 2))) yield a + b
cmd0.sc:1: value withFilter is not a member of scala.util.Right[Nothing,(Int, Int)]
val res0 = for((a, b) <- Right((1, 2))) yield a + b
                              ^
cmd0.sc:1: type mismatch;
 found   : Any
 required: String
val res0 = for((a, b) <- Right((1, 2))) yield a + b
                                                  ^
Compilation Failed
odersky commented 7 years ago

Thanks for all the issues! Very useful to have them here. We are currently super-busy preparing for Scala Days Copenhagen, but will get back to them, hopefully soon after. Of course if people want to jump in discussing them now, please do so!

lihaoyi commented 7 years ago

This turned up in the scala/contributors gitter

scala> val 2 = 1
scala.MatchError: 1 (of class java.lang.Integer)
  ... 32 elided
som-snytt commented 6 years ago

The canonical form is val 1 = 2.

jdegoes commented 5 years ago

In my opinion, destructuring assignment should always be irrefutable. The compiler knows, or at least could know, that many things users try to do are type unsafe and cannot possibly succeed at runtime. This leads to bugs (some obvious, some subtle) and is just a generally confusing part of the language.

odersky commented 5 years ago

I like the idea to use case to signal a match that can fail (or filter, in the case of for expressions). My tendency would be to always write case in front of a pattern that is refutable. So it would be

    for (case (a, b) <- xs) ...
    for (case x :: xs <- xss)
    val case (a, b) = x
    val case x :: xs = ys

As opposed to

    case val (a, b) = x
    case val x :: xs = ys

case val is too close to case object, which means something different, IMO.

smarter commented 5 years ago

Or maybe case (a, b) = x without the val ?

odersky commented 5 years ago

That would be confusing on the right hand sides of match expressions

smarter commented 5 years ago

Right. In fact I think that having case (a, b) in a for-comprehension and val case (a, b) outside of one do different things (and one being safe and the other potentially crashing at runtime) is also confusing, so I'd go with:

for (case (a, b) <- xs) ...
for (case x :: xs <- xss)
val (a, b) @unchecked = x
val x :: xs @unchecked = ys

The @unchecked reminds you that the match is inexhaustive as usual, and it can be omitted if the match is in fact exhaustive.

odersky commented 5 years ago

I like that, too.

LPTK commented 5 years ago

To avoid breaking code that relies on the current behavior (the filtering behavior of for { (a, b) <- Seq(1 -> 2, 3 -> 4, 5) }), the exhaustiveness checker should be made much more stringent than it currently is, at least when checking for-comprehension patterns. It's also clear we want that additional strictness in destructuring val bindings.

So why not make exhaustiveness checking always strict by default, while we're at it?

What's the rationale for warning in this case:

scala> def f: List[Int] => Int = { case Nil => 0 }
1 |def f: List[Int] => Int = { case Nil => 0 }
  |                            ^
  |                            match may not be exhaustive.
  |
  |                            It would fail on pattern case: List(_, _: _*)

...but not warning in that case?:

scala> def f: Seq[Int] => Int = { case Nil => 0 }
def f: Seq[Int] => Int

I've seen users confused by the following code failing at runtime – an honest mistake IMHO, confusing Seq with List:

def f: Seq[Int] => Int = { case Nil => 0 case x :: xs => x }

To get rid of the warning, users who know better could use a special unchecked or fallible function:

def fallible[A,B](f: PartialFunction[A,B]): (A => B) = f
// (The very fact we can write the abvove function without warnings, because
//  PartialFunction[A,B] <: (A => B), seems itself broken, by the way.)

To be used as:

scala> def f: Seq[Int] => Int = fallible{ case Nil => 0 case x :: xs => x }
def f: Seq[Int] => Int
lihaoyi commented 5 years ago

@odersky @smarter I'd be happy for either of those syntaxes =)

odersky commented 5 years ago

@LPTK I don't think we can currently rely on the exhaustiveness checker for either of those cases. Irrefutability has to be established separately, but it should be much simpler for a single case.

Upgrading the exhaustiveness checker to detect all possible violations would be a long term project. We'd have to find someone to do this first.

I like the idea of fallible.