scala / scala3

The Scala 3 compiler, also known as Dotty.
https://dotty.epfl.ch
Apache License 2.0

Decide on the role of `:` in indentation syntax #7136

Closed odersky closed 4 years ago

odersky commented 5 years ago

The meaning of : for significant indentation is more contentious than other aspects. We should come up with a crisp and intuitive definition of where : is allowed.

One way I'd like to frame significant indentation in Scala 3, is that braces are optional, analogously to how semicolons are optional. But what does that mean? If we disregard : it means:

At all points where code in braces {...} is allowed, and some subordinate code is required, an indented code block is treated as if it was in braces.

The second condition is important, since otherwise any deviation from straight indentation would be significant, which would be dangerous and a strain on the eyes. But with that condition it's straightforward: If some subordinate code is required, either that code follows on the same line, or it is on a following line, in which case it should be indented. Braces are then optional, they are not needed to decide code structure. In the following I assume this part as given.
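As an illustrative sketch of this ground rule (not part of the original comment, and assuming a Scala 3 compiler): after `=`, subordinate code is required, so an indented block is read exactly as if it were in braces. The function names are hypothetical.

```scala
// With braces: the block after `=` is delimited explicitly.
def sumOfSquaresBraces(xs: List[Int]): Int = {
  val squares = xs.map(x => x * x)
  squares.sum
}

// Without braces: `=` requires subordinate code, so the indented lines
// that follow are treated as if they were enclosed in braces.
def sumOfSquaresIndented(xs: List[Int]): Int =
  val squares = xs.map(x => x * x)
  squares.sum
```

Both definitions should behave identically; only the delimiters differ.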

So that leaves the places where some code is not required but we still like to insert braces. There are two reasonable use cases for this:

  1. Between the extends clause of a class, object, or similar and its (optional) template definitions in braces.
  2. In a partial application such as
    xs.foreach:
      ...

    The motivation for this second case is to make uses of library-defined operations close to native syntax. If native syntax allows braces to be dropped, there should be a way for library-defined syntax to do the same.

Possible schemes

The current scheme relies on colons at end of lines that can be inserted in a large number of situations. There are several possible alternative approaches I can see:

  1. Don't do it. Insist on braces for (1) and (2).

  2. Split (1) and (2). Make indentation significant after class, object, etc headers without requiring a colon. This has the problem that it is not immediately clear whether we define an empty template or not. E.g. in

    object X extends Z
    
     // ...
     // ...
    
     object Y

    it is hard to see whether the second object is on the same level as the first or subordinate. But semantically it makes a big difference. So a system like that would be fragile. By contrast, a mandatory : would make it clear. Then the version above would be two objects on the same level and to get a subordinate member object Y you'd write instead:

    object X extends Z:
    
     // ...
     // ...
    
     object Y

    So I actually quite like the colon in this role.

  3. Split (1) and (2) but require another keyword to start template definitions after class, object, etc headers. @eed3si9n suggested where. It's a possibility, but again I do like : at this point. It reads better IMO (and where might be a useful keyword to have elsewhere, e.g. in a future Scala with predicate refinement types).

  4. Keep : (or whatever) to start templates but introduce a general "parens-killing" operator such as $ in Haskell with a mandatory RHS argument. If that occurred at the end of a line, braces would be optional according to our ground rules.

    I fear that a general operator like that would lead to code that was hard to read for non-experts. I personally find $ did a lot more harm than good to the readability of Haskell code.

  5. Restrict the role of : to the two use cases above. I.e. allow a colon if

    • a template body is expected, or
    • in a partial application, after an identifier, closing bracket ] or closing parenthesis ).

    This way, we would avoid the confusing line noise that can happen if we allow : more freely.

My personal tendency would be to go for that last option.
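A minimal sketch of what option 5 looks like in practice, assuming a Scala 3.3 or later compiler, where a colon argument after an identifier or closing parenthesis is accepted. All names here (`Shape`, `Rect`, `repeat`, `countCalls`) are hypothetical and chosen only for illustration.

```scala
// Use case 1: a colon where a template body is expected.
trait Shape:
  def area: Double

class Rect(w: Double, h: Double) extends Shape:
  def area: Double = w * h

// A library-defined operation taking a by-name block.
def repeat(n: Int)(body: => Unit): Unit =
  var i = 0
  while i < n do
    body
    i += 1

// Use case 2: a colon after a closing parenthesis in a partial
// application; the indented block becomes the next argument.
def countCalls(n: Int): Int =
  var calls = 0
  repeat(n):
    calls += 1
  calls
```

Here `repeat(n):` reads much like a built-in loop, which is the motivation given for the second use case.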

LPTK commented 5 years ago

Regardless of whether we find a solution for (2), I would go for requiring braces for (1) — object, class, and trait definitions — which is what I proposed in the other thread.

But let me repeat and strengthen the rationale here:

odersky commented 5 years ago

@LPTK I believe that end markers are actually a far superior way to delineate class scopes. And having both braces and end markers would look weird.

JanBessai commented 5 years ago

Not sure if this is the correct thread, but: How does indentation sensitivity mix with triple quoted multi-line strings? Does the end-quote have to be indented? If so: how do you write down a line break + spaces before the end quote? Indent further?

sjrd commented 5 years ago

At the risk of having to be my own devil's advocate in a week or so ... what about using with instead of : for the two use cases above?

For templates:

class Foo extends Bar with SomeTrait with
  def x: Int = 42

The with there does not shock me. It is even quite easy to interpret it as "braces as optional" if we also allow

class Foo extends Bar with SomeTrait with {
  def x: Int = 42
}

That syntax makes sense to me, as the contents of the block are added as members of Foo just as much as the members of SomeTrait. I don't know, it kind of makes sense to me.

For that, the lexer would introduce an indent if with is at the end of the line, the next line is indented and starts with anything but an identifier.

I would actually prefer to simply not have anything to open a template, though. However this might require the parser to feed into the lexer to implement, which is not ideal. Or maybe we can make it work by simply counting opening and closing brackets? That would be nice.

For method calls:

xs.foreach with
  x =>
    println(x)

val y = xs.foldLeft(0) with
  (prev, x) =>
    prev + x

val z = optInt.fold with
  println("nope")
with
  x => println(x)

This syntax had actually been proposed in the original "new implicits" proposals, for implicit parameter lists. So it must have appealed at some point. Except here it is used to pass normal arguments, when they are blocks (by-name params or lambdas).
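For reference, the two block arguments in the `optInt.fold` example correspond to the two parameter lists of the standard library's `Option.fold`: a by-name default for the empty case, then a function for the defined case. A small sketch in ordinary parenthesized syntax (the function name `describe` is hypothetical, and strings are returned instead of printed so the result can be checked):

```scala
// Option.fold takes the "empty" result first (by name),
// then a function applied to the contained value.
def describe(optInt: Option[Int]): String =
  optInt.fold("nope")(x => s"got $x")
```

Any brace-free syntax for arguments has to cover both kinds of argument seen here: a plain by-name block and a lambda.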

The advantage over : is that, well, it's not :. There have been several concerns about the multiple problems that : exposes, not the least being the overloading wrt. type ascriptions.

with also suffers from overloading in this case, but perhaps it is less annoying than : because it would only have two meanings that are very well visually separated: with in a class header is composition; with in an expression is a block argument.

I don't really like what I'm proposing, but I dislike it a lot less than :.

jducoeur commented 5 years ago

I don't really like what I'm proposing, but I dislike it a lot less than :.

Strong agreement. I think : is terrible -- not only is it badly overloaded, it's simply too visually subtle. I'm concerned that it will lead to bugs due to people just missing it. I dislike this whole approach (IMO it crosses the line into designing a different language, not enhancing Scala, and enormously increases the risk of splitting the community) but probably this aspect most of all...

eed3si9n commented 5 years ago

In #7083 I wrote:

Since colon means type annotation (ascription) x: A already, reusing this to mean "begin bill of material" or "begin block" seems odd to me too.

F# uses =, and Haskell uses where:

module Main where  
  import A  
  import B  
  main = A.f >> B.f

(2) In a partial application such as xs.foreach:

I think with is an improvement over :, but it does suffer from overloading. Could we borrow <| from F# here?

xs.foreach <| x =>
  println(x)

ys.foreach <| y => println(y)

val y = xs.foldLeft(0) <| (prev, x) =>
  prev + x

If you squint, it looks like begin {. Note here that the indentation doesn't start until \n so you can write x => on the same line.

pattern match

Pattern matching is an odd one because <| is the opposite direction.

val kind = ch match
  case ' '  => "space"
  case '\t' => "tab"
  case _    => s"'$ch'-character"

You basically want |> but that might be too confusing:

val kind = ch |>
  case ' '  => "space"
  case '\t' => "tab"
  case _    => s"'$ch'-character"

I think we should just allow match to be a special indentation introducer.

(1) template definitions

What's interesting about template is that it is both a list of members and it's also the body of the constructor. So in that sense it might make sense to keep it the same syntax as block introducer.

class Contact(name: String) extends Bar with SomeTrait <|
  def x: Int = 42

  DB.append(name)

object Contact <|
  def apply: Contact = new Contact("")

I understand that = is not exactly right, but visually it looks less intimidating I think

class Contact(name: String) extends Bar with SomeTrait =
  def x: Int = 42

  DB.append(name)

object Contact =
  def apply: Contact = new Contact("")

I don't really like what I'm proposing, but I dislike it a lot less than :.

Ditto.

lihaoyi commented 5 years ago

I propose re-using existing keywords wherever possible. The whole point of this exercise is to make the syntax lightweight: having long keywords like where defeats the purpose entirely. If necessary, we could commandeer do as a short-enough keyword to delimit blocks in the case of ambiguity. I think : is common enough that overloading it would be confusing, and it really isn't that much shorter than do anyway.

Re-using existing keywords can get us surprisingly far. Consider Scalite:

package scalite.tutorial                                    package scalite.tutorial

class Point(xc: Int, yc: Int)                               class Point(xc: Int, yc: Int) {
    var x: Int = xc                                           var x: Int = xc
    var y: Int = yc                                           var y: Int = yc
    def move(dx: Int, dy: Int) =                              def move(dx: Int, dy: Int) = {
        x = x + dx                                              x = x + dx
        y = y + dy                                              y = y + dy
                                                              }
    override def toString() =                                 override def toString() = {
        "(" + x + ", " + y + ")"                                "(" + x + ", " + y + ")"
                                                              }
                                                            }
object Run                                                  object Run {
    def apply() =                                             def apply() = {
        val pt = new Point(1, 2)                                val pt = new Point(1, 2)
        println(pt)                                             println(pt)
        pt.move(10, 10)                                         pt.move(10, 10)
        pt.x                                                    pt.x
                                                              }
                                                            }
var x = 0                                                   var x = 0
for(i <- 0 until 10)                                        for(i <- 0 until 10) {
    val j = i * 2                                             val j = i * 2
    val k = j + 1                                             val k = j + 1
    x += k                                                    x += k
                                                            }
val list =                                                  val list = {
    for(i <- 0 to x) yield                                    for(i <- 0 to x) yield {
        val j = i + 1                                           val j = i + 1
        i * j                                                   i * j
                                                              }
                                                            }
list.max                                                    list.max
// 10100                                                    // 10100
val all = for                                               val all = for {
    x <- 0 to 10                                              x <- 0 to 10
    y <- 0 to 10                                              y <- 0 to 10
    if x + y == 10                                            if x + y == 10
yield                                                       } yield {
    val z = x * y                                             val z = x * y
    z                                                         z
                                                            }
all.max                                                     all.max
// 25                                                       // 25

I think this looks superior to using any delimiter. Note that the above already works in Scala 2.11/12/13, and has worked for 5 years now.

We have to use the same whitespace rules for both def/var/vals and class/trait/object. Having mixed rules for what things are "whitespace compatible" and what things aren't is a mess, and would look terribly ugly, especially in Scala where (unlike Java) having nested inner class/objects at the same level as your def/var/vals is common, and having top-level def/var/vals at the same level as your top-level class/objects is going to become common as well. If we want classes/objects to be "special", that ship sailed long ago.

One subtlety that we have to take care of is providing for higher-order methods. Code like:

val foo = bar
  .map{x => 
    val y = x + 1
    y + 1
  }
  .foreach{ x =>
    val y = x + 1
    println(y)
  }

is extremely common with Scala's method-chaining conventions, and I'd want to be able to provide a whitespace-compatible syntax:

val foo = bar
  .map x => 
    val y = x + 1
    y + 1
  .foreach x =>
    val y = x + 1
    println(y)

In Scalite I commandeered the do keyword for this purpose:

val xs = 0 until 10                                         val xs = 0 until 10
val ys = xs.map do                                          val ys = xs.map{
    x => x + 1                                                x => x + 1
                                                            }
ys.sum                                                      ys.sum
// 55                                                       // 55

val zs = xs.map do                                          val zs = xs.map{
case 1 => 1                                                   case 1 => 1
case 2 => 2                                                   case 2 => 2
case x if x % 2 == 0 => x + 1                                 case x if x % 2 == 0 => x + 1
case x if x % 2 != 0 => x - 1                                 case x if x % 2 != 0 => x - 1
                                                            }
zs.sum                                                      zs.sum
// 45                                                       // 45
val ws = xs.map do x =>                                     val ws = xs.map { x =>
    val x1 = x + 1                                            val x1 = x + 1
    x1 * x1                                                   x1 * x1
                                                            }
ws.sum                                                      ws.sum
// 385                                                      // 385

Since we seem to be deprecating do-while loops as I suggested 5 years ago, the do keyword is freed up. As a short, 2-character, now entirely unconflicted keyword, we should make sure we use it as effectively as possible.
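For context, a sketch of how a do-while loop can be rewritten once the construct is gone, assuming a Scala 3 compiler: the loop body moves into the `while` condition block, whose last expression is the test. The function name `countDigits` is hypothetical.

```scala
// Counts decimal digits; the body must run at least once,
// which is what do-while used to express.
def countDigits(n: Int): Int =
  var remaining = n
  var digits = 0
  while
    digits += 1          // former loop body
    remaining /= 10
    remaining != 0       // former loop condition, evaluated last
  do ()
  digits
```

Note that this form still uses `do` as part of the `while` loop syntax, so the keyword is not entirely unused.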

For a longer example using what I propose, please take a look at this self-contained JSON parser:

I think it would be a good starting point to compare different syntaxes, being meaty enough to really give you a feel of things where trivial 10-line examples do not. The short examples are also illustrative:

odersky commented 5 years ago

I want to comment on sub-part (2), i.e. what to use for starting an indented argument. I believe that no single keyword would work well in that role.

with was borderline acceptable as an implicit argument, since it evokes pairing with a context. But for plain arguments it looks wrong at least as often as it looks OK. For instance:

math.logarithm with 
  val x = f(y)
  x * x

I believe with would only work well if the real function application is to something else and the part following with is in some way an accessory to that. E.g. in xs.map with f it works since the mapped argument is xs and f is an accessory. But the case of logarithm above shows that we cannot generalize that to all applications.

do evokes side effects. It is used now in that role specifically in while or for loops. E.g.:

for 
  x <- xs
  y <- ys
do
  println(x + y)
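To make the contrast concrete, here is a small sketch (assuming a Scala 3 compiler; the function names are hypothetical) of the same traversal written once with `do`, which needs a side effect to produce anything, and once with `yield`, which is a pure expression:

```scala
import scala.collection.mutable.ListBuffer

// `do` form: the body is a statement, so results are collected by mutation.
def sumsWithDo(xs: List[Int], ys: List[Int]): List[Int] =
  val out = ListBuffer.empty[Int]
  for
    x <- xs
    y <- ys
  do
    out += x + y
  out.toList

// `yield` form: the body is an expression and the loop itself has a value.
def sumsWithYield(xs: List[Int], ys: List[Int]): List[Int] =
  for
    x <- xs
    y <- ys
  yield x + y
```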

I believe for pure functions, do is out of place. E.g.:

transpose do
  val a: Matrix = ...
  val b: Matrix = ...
  a * b

In fact, the pattern of function application is so general and multi-faceted that no single keyword can do it justice. That's why Dijkstra uses infix point (well, that's taken already in Scala!) and Haskell uses $. But colon does work brilliantly in this role! Evidence #1 is Python, where it is perceived to be very natural. Yes, I know Python uses it everywhere instead of just for this purpose, but still... Evidence #2 is common language. It's very natural to write something like:

  To make a cake:
    preheat oven,
    mix flours, eggs and milk,
    bake for one hour.

The : specifically introduces a list of statements that's subordinate to a prefix clause. That's completely grammatical. So, I believe strongly that : is the best operator for this.

As to possible ambiguity with type ascription: I really don't think that's a problem. : at the end of line and : used infix are visually quite distinct. The only caveat is that we should not let one follow the other. I.e.:

  def f(): T:
     return foo

would be awkward. But ever since procedure syntax was dropped, Scala does not have syntax where this pattern could occur.

The only possible ambiguity is in a type ascription of an expression spanning multiple lines like this:

  someLongExpression:
    someLongType

But I have not seen code like this in the wild, and in fact our code base including all tools, all tests, and community build does not have a single instance where this pattern occurs. Why does it not occur? I guess if you have long expressions and types to combine you realize that you are probably better off factoring stuff into a val:

  val someId: someLongType = 
    someLongExpression
  someId

And, if you really need to write a multi-line ascription, you can always do:

  someLongExpression
    : someLongType

which in fact reads much better. So, in summary, any ambiguity would be extremely rare, and is easily avoided. The other evidence why ambiguities are not a concern is again Python. Python does use : for both roles and the Python community is not known to be sloppy with syntax. Interestingly Python does not use : to indicate a function return type, since that would let them run into precisely the awkwardness I referred to earlier. It uses -> instead. But since Scala uses = instead of : to start a function body it does not run into that problem.

One downside of : is that it only works at the end of lines. So:

  xs.map: x =>
    val y = f(x)
    g(y)

does not work. You have to format it instead as:

  xs.map: 
    x =>
    val y = f(x)
    g(y)

I think this is not so bad. In real code the { x => part is often quite far to the right because the expression preceding it is long. This makes it hard to see the bound name. The vertical syntax makes it much clearer what is bound.
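For comparison, a sketch of the layouts under discussion, assuming a Scala 3.3 or later compiler. Those later releases accept lambda parameters on the same line as the colon, which was not the case for the scheme described in this comment. The helpers `f` and `g` are hypothetical stand-ins.

```scala
def f(x: Int): Int = x + 1
def g(y: Int): Int = y * 2

val xs = List(1, 2, 3)

// Traditional braces, with the bound name at the end of the first line.
val withBraces = xs.map { x =>
  val y = f(x)
  g(y)
}

// Vertical layout: the colon ends the line and the lambda starts below it.
val vertical = xs.map:
  x =>
    val y = f(x)
    g(y)

// Same-line parameters after the colon (accepted in Scala 3.3 and later).
val sameLine = xs.map: x =>
  val y = f(x)
  g(y)
```

All three values should be equal; the layouts differ only in where the bound name `x` appears.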

lihaoyi commented 5 years ago

I think this is not so bad. In real code the { x => part is often quite far to the right because the expression preceding it is long. This makes it hard to see the bound name. The vertical syntax makes it much clearer what is defined.

To me this formatting is the deal breaker, much more than the ambiguity around type ascriptions. I work with a lot of real code formatted exactly as you describe, and it reads excellently. If the LHS is long, the .map goes on a new line.

I have seen no code at all formatted similar to how you propose, and subjectively it looks awful. If it really looked better, people would already be formatting their lambdas like that right now, and they’re not.

odersky commented 5 years ago

@lihaoyi

I have seen no code at all formatted similar to how you propose, and subjectively it looks awful. If it really looked better, people would already be formatting their lambdas like that right now, and they’re not.

Fair point.

Here's a crazy idea for this pattern: use case! Examples:

xs.map case x =>
  val y = f(x)
  g(y)

xs.collect case Some(n) => n

xs.foreach case i => println(s"next: $i")

Points in favor:

To make this work we'd have to add one production to Expr1. It's the last line below:

Expr1 ::= ...
             |  Expr2 ‘match’ ‘{’ CaseClauses ‘}’
             |  Expr2 ‘case’ Pattern [Guard] ‘=>’ Expr

WDYT?

odersky commented 5 years ago

[Aside: You may have noted that I "ate my own dog food" in the comments above: every single indented section was introduced with :. Readers should judge for themselves whether this is natural or not.]

lihaoyi commented 5 years ago

case looks ok-ish to me, but that seems to introduce an even worse ambiguity: that between partial functions and total functions! This is something we had already made efforts to disambiguate in Dotty (e.g. requiring case for partial functions in for-comprehensions), so that case always means partiality, and partiality always means case. Requiring case just to make indentation work properly is definitely a step backwards.

Honestly, I think we should just use do. The English meaning is almost irrelevant: with, class, object, for, etc. in Scala already mean vastly different things in English and Scala, and that's been mostly OK. Having do mean something specific in Scala isn't going to be the end of the world, especially since it's going to be entirely unambiguous and consistent: do will have no other meanings once do-while loops are out. I think it would definitely be preferable to overloading with or case or some other keyword which have very specific, existing Scala meanings.

I appreciate the desire to use :, but function-literals-with-arguments is a real sticking point. Python can get by with (1) not having multiline lambdas, (2) a different -> foo syntax for annotating return types, and (3) a dedicated syntax for context managers. Scala has none of these things, and passing multi-line n-arg function literals to higher order functions is the order of the day. We have to make sure it looks pretty and first class.

odersky commented 5 years ago

I have strong objections against do. do universally means imperative side effect, in natural language as well as in all programming languages I know. (do comprehensions in Haskell model side effects via monads). So I believe we cannot simply change its meaning, in particular in a language like Scala which is predominantly functional but still allows side effects.

case is often used with partial functions but not always. A counter example is:

case class Point(x: Double, y: Double)

val points: List[Point] = ...
points.map {
  case Point(x, y) => ...
}

So, case is also used for destructuring in total functions. A simple variable pattern is a special case of that (and it's obvious at a glance that that's what it is)
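A runnable sketch of the two uses side by side (the sample data and value names are hypothetical): `case` inside `map` destructures every element, so the function is total, while `case` inside `collect` defines a genuinely partial function.

```scala
case class Point(x: Double, y: Double)

val points: List[Point] = List(Point(1.0, 2.0), Point(3.0, 4.0))

// Total: every Point matches the pattern, `case` only destructures.
val sums: List[Double] = points.map { case Point(x, y) => x + y }

// Partial: None does not match, so `collect` skips it.
val maybeInts: List[Option[Int]] = List(Some(1), None, Some(3))
val defined: List[Int] = maybeInts.collect { case Some(n) => n }
```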

lihaoyi commented 5 years ago

I'd argue that the vast majority of dos in any language are not in Haskell (or Java/C/C++/Javascript/etc., where do-while loops are uncommon) but in Ruby, which uses do exactly to delimit lambdas which can have a return value:

$ cat foo.rb
def my_map(array)
  new_array = []

  for element in array
    new_array.push yield element
  end

  new_array
end

result = my_map([1, 2, 3]) do |number|
  number * 2
end

puts result.to_s
$ ruby foo.rb
[2, 4, 6]

Here it is used exactly as I am proposing for Scala: to delimit a multiline lambda function taking parameters and returning a value, as a replacement for curlies.

result = my_map([1, 2, 3]){ |number|
  number * 2
}

The usage of do blocks in higher-order collection transformations is exactly as it would be in Scala:

$ cat foo.rb
result = [1, 2, 3, 4].reduce do |sum, i|
  x = i * i
  sum + x
end

puts result.to_s

$ ruby foo.rb
30

Honestly I find case for de-structuring a bit of a wart: we are already getting rid of the need for case-destructuring tuples in Dotty, which I think is a great step forward. Regardless, to me having case mean "partial functions and destructuring" is still a lot better than "partial functions and destructuring and multiline function literals". The last concept really has nothing to do with the first two, and the case keyword is completely out of place: it is taking a keyword with a well-known meaning, and making it required for something entirely unrelated
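The tuple change referred to here can be sketched as follows, assuming a Scala 3 compiler (value names are hypothetical): a two-parameter lambda can be passed where a function over pairs is expected, so `case` is no longer needed just to take a tuple apart.

```scala
val pairs: List[(Int, Int)] = List((1, 2), (3, 4))

// Scala 2 style: `case` is required to destructure the tuple.
val withCase: List[Int] = pairs.map { case (a, b) => a + b }

// Scala 3 parameter untupling: a plain two-parameter lambda suffices.
val untupled: List[Int] = pairs.map((a, b) => a + b)
```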

Ichoran commented 5 years ago

The niche a language occupies is a highly relevant consideration when choosing its syntax. Unless we want Scala to stop trying to occupy the "ultra-powerful type system" niche, where it keeps pace with Haskell, do is a bad idea, even though the keyword is available.

I also think precedent is very important when deciding whether a language syntax is a good idea or not. I fully agree that : is completely natural linguistically--it's frequently used for introducing a following bunch of stuff. However, we're all really well-trained to see : as type ascription (either an actual type or a typeclass), and that makes me hesitant--even aside from genuine parse ambiguities--to repurpose it as a begin-block symbol.

With regards to case--well, if we must, okay, but like Li Haoyi I find it a wart in general. Rust does fine without it, and I wish we could too; or at least, I wish we could restrict it to only partial functions. Total functions, even with multiple match blocks, would be nicer without case.

Finally, I don't think my objections to having a bunch of different block-introductions have been adequately addressed. If we do pick something, I think it should be universal, even if it admits weird stuff. I would rather allow

foo:
  3
: 
  4

and insist on

if p then:
  foo
else:
  bar

than have to try to intuit which things require : and which do not. In addition to the keyword confusions I mentioned before, it also makes a sharper distinction between builtin language features and added syntax. One of the most beautiful features of Scala is that you can define your own libraries that feel perfectly seamless, as if they're a built-in part of the language. Having special rules for where : is used and where not would break this.

lihaoyi-databricks commented 5 years ago

I think it should be universal, even if it admits weird stuff. I would rather allow

foo:
  3
: 
  4

I think this would look pretty reasonable with a keyword like do:

foo do
  ???
do
  ???

Each indented do block is equivalent to a pair of braces. Although I wonder if it's possible to make the lexer/parser smart enough to omit the first do?

foo
  ???
do
  ???

This would look identical to a while-loop with a multiline condition:

while
  ???
do
  ???

Or if-else

if(???)
  ???
else
  ???

Perhaps even an if-else with a multiline condition:

if
  ???
do
  ???
else
  ???

if
  ???
then
  ???
else
  ???

That would essentially put user-land code syntactically on even footing with the builtin constructs, which seems like exactly what @Ichoran wants

lihaoyi-databricks commented 5 years ago

Unless we want Scala to stop trying to occupy the "ultra-powerful type system" niche, where it keeps pace with Haskell

This seems like a reasonable thing to me. Scala is its own language with its own styles and conventions. I'm pretty sure this whole whitespace experiment was to try and emulate the approachability and widespread appeal of Python. We aren't trying to attract Haskellites to Scala. That would make "keeping pace with Haskell" a non-goal altogether (though it still is unclear to me how the spelling of a keyword affects the power of the type system).

odersky commented 5 years ago

@lihaoyi

After thinking a bit more about it I believe the case idea is indeed a bit crazy, since it risks overloading meanings, as you say. But do is not a workable choice either, for the reasons I stated. I believe Ruby's precedent is not a real counter-example: Ruby (and Smalltalk, from where the block syntax came) are at their heart imperative OO languages. So do is natural. Most closures passed in Ruby or Smalltalk would have side effects. This is no longer true in Scala.

We should also not invent many different constructs that mean the same thing.

So, one valid choice would still be: Do nothing. If you want an argument that is a multi-line lambda, use braces, or arrange the lambda vertically, as I had initially proposed. Sure, nobody does it like this now because it costs an extra line. But it might actually lead to clearer code.

If we must invent a parens killing operator, I think with is a reasonable choice, after all.

xs.map with x =>
  val y = f(x)
  g(y)

xs.collect with Some(n) => n

xs.foreach with i => println(s"next: $i")

It looks better than do for pure expressions like the map example above, but worse for side-effecting expressions like the foreach example. As a predominantly functional language, Scala should optimize for the pure scenario. My main concern about doing this is how to restrict it. All the examples above take lambdas, and that looks OK. But what about

xs.map with f

? You should be able to eta-reduce a lambda x => f(x) to just f without changing its context, so that would imply that this should be legal. But then, don't we also have to accept

sqrt with x + x

? People will write that sort of code to save a pair of parens! But that's where I think we have made matters worse, not better.

One possible choice would be to restrict the type of the right operand of with to some function type (or maybe accept call-by-name parameters as well?) That's a bit weird, since there is no syntactic reason for this restriction. But it might be a workable compromise.

In summary, I am still very much on the fence about all this, so the option of doing nothing for now (i.e. don't introduce a parens killing operator) looks reasonable to me. I have also learned that the two issues of using indentation for arguments and parens killing operators are not necessarily the same, since parens-killing operators will also be used on a single line.

odersky commented 5 years ago

One idea which might be attractive is to restrict with's right operand to call by name arguments or lambdas. That means, with is a visual cue that its argument is not strict. with introduces significant indentation by itself, so the following variation of @sjrd's example works:

val z = 
  optInt.fold with
    println("nope")
  with x => 
    val y = f(x)
    println(y)

The indentation is prompted in one case by with and in the other by =>.

People have often wished that {...} would indicate by-name arguments only. That did not work since we also want sometimes multi-statement by-value arguments, and {...} was the only way to get them. But with significant indentation we could introduce two operators that distinguish the two cases.
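The by-name versus by-value distinction that such an operator would highlight can be sketched as follows (assuming a Scala 3 compiler; all names are hypothetical). With braces, the two call sites look identical even though only one of them evaluates its block:

```scala
// Returns how often each block was evaluated when the callee ignores it.
def evaluationCounts(): (Int, Int) =
  var byValueCount = 0
  var byNameCount = 0
  def ignoreByValue(x: Int): Int = 0     // strict parameter
  def ignoreByName(x: => Int): Int = 0   // by-name parameter
  // Same syntax at both call sites, different evaluation behaviour.
  ignoreByValue { byValueCount += 1; 1 } // block runs before the call
  ignoreByName { byNameCount += 1; 1 }   // block never runs: x is unused
  (byValueCount, byNameCount)
```

The strict block runs once and the by-name block not at all, which is exactly the difference a reader cannot see from `{...}` alone.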

If we do this, another question is whether : should then apply only to by-value arguments, or whether we want to keep it general.

[EDIT:] I think we want to keep it general, since : would be equivalent to {...}, which is used for both by-name and by-value arguments. So the proposal would be to have a separate with construct that highlights by-name arguments.

lihaoyi commented 5 years ago

with suffers the same problem as case: it currently has a very specific meaning (mixin traits) and it is the wrong meaning for what we have here. It is also a very long keyword for what in my experience is a very common operation.

Here's two more ideas:

  1. How hard would it be to make do without a keyword? Could we use some combination of lenient parsing + post-validation to allow syntax like this?
val foo = bar
  .map x => 
    val y = x + 1
    y + 1
  .foreach x =>
    val y = x + 1
    println(y)

It seems we would need up to 1 line of lookahead in the lexer/parser, which seems like something that can be afforded. I haven't thought through all possible ambiguities, but it seems to me like with some fiddling this could work. We already do a lenient-parse+post-validation step in parsing lambda argument lists anyway.

  2. Could we introduce a new keyword? F# uses the fn keyword, which fits perfectly here:
val foo = bar
  .map fn x => 
    val y = x + 1
    y + 1
  .foreach fn x =>
    val y = x + 1
    println(y)

Short, unambiguous, and precisely meaningful. Sure it would be introducing a new keyword, but I think for such a common operation it is worth it vs. overloading an existing keyword that doesn't really fit.

odersky commented 5 years ago

I did some exploration, looking at actual usages of { ... => in our codebase. That made me more skeptical about with. For instance, pairing with exists is really bad:

  sym.baseClasses.exists with ancestor =>
    ancestor.hasAnnotation(jsdefn.EnableReflectiveInstantiationAnnot)

This is a LOT less clear than the original

  sym.baseClasses.exists { ancestor =>
    ancestor.hasAnnotation(jsdefn.EnableReflectiveInstantiationAnnot)
  }

(and do would be just as bad in its place). This shows again that no single keyword can express function application. The only thing one could do is have a keyword that marks the start of a function and that leaves application silent. case was one example of this, fn would be another. But it's still awkward to a degree where I would prefer we leave it in braces.

Can we use just :? Only if we severely restrict its use since otherwise we would get ambiguities with type ascription. E.g.

  sym.baseClasses.exists: ancestor =>
    ancestor.hasAnnotation(jsdefn.EnableReflectiveInstantiationAnnot)

would theoretically work since the type in a type ascription cannot be a naked function type without parens around it. But the longer the parameter list gets the harder it would be to parse (for humans).

lihaoyi commented 5 years ago

If : is unambiguous (assuming some amount of leniency/lookahead to see the => at end-of-line) then sym.baseClasses.exists: ancestor => seems to me like the best option so far.

It wouldn't be the first time : and => are parsed differently depending on enclosing context and newlines/semicolon-insertion:

@ object foo{
    identity: Int => Int
    println("this is binding `identity` to a self-type of Int followed by a expression " + identity)
  }
defined object foo

@ foo
this is binding `identity` to a self-type of Int followed by a expression ammonite.$sess.cmd9$foo$@555856fa
res10: foo.type = ammonite.$sess.cmd9$foo$@555856fa

@ object foo{
    (identity: Int => Int)
    println("this is a type ascribing `Predef.identity` to a function type " + identity)
  }
cmd11.sc:3: missing argument list for method identity in object Predef
  println("but this one is " + identity)

@ object foo{
    val x = {identity: Int => Int}
    println("this is binding `identity` to an Int argument of a lambda which returns the Int companion " + x(1))
  }
defined object foo

@ foo
this is binding `identity` to an Int argument of a lambda which returns the Int companion object scala.Int

While not ideal, it seems to have caused little enough unhappiness in the past, so I wouldn't mind pushing the envelope a little bit and overloading : for opening indented blocks as well

Ichoran commented 5 years ago

I appreciate the effort, but don't think the results from any proposal are particularly visually pleasing or sufficiently general. I'm somewhat puzzled about why people aren't more concerned about the generality/simplicity.

I think we should have a good story about how to select when and where to put a brace-killer. I don't think memorizing a dozen or more cases is a good story. This suggests to me that the solution is that brace-killing happens always.

Maybe you require no syntax:

2 + 5 *
  3 + 8

Or maybe you do:

2 + 5 * :
  3 + 8

2 + 5 * ...
  3 + 8

2 + 5 * do
  3 + 8

but I don't think it works to have to pick and choose the cases. I think even if/then and for/yield are too much to remember if we're going for simplicity.

Python doesn't make you remember a pile of cases. From a random example on the web:

    temperature = float(input('What is the temperature? '))
    if temperature > 70:
        print('Wear shorts.')
    else:
        print('Wear long pants.')
    print('Get some exercise outside.')

Do you see that else:? The : is totally unnecessary because of the keyword else. And yet : provides consistency in introducing a new block indent.

Yes, it's just (redundant) punctuation. But punctuation is important:

do you see that else: the : is totally unnecessary because of the keyword else and yet : is required for consistency yes its just punctuation but punctuation is important

I think the discussion about case vs do vs with is ultimately much less important than allowing generality and avoiding context-sensitivity.

odersky commented 5 years ago

@ichoran Putting : behind else might work well but is out-of-scope for Scala since it would create a different dialect. I believe all we might be reasonably able to do is make braces optional. But we cannot require : where none was required before.

krakel commented 5 years ago

little joke
I like this Pascal like syntax: ... : ... end
It looks like: ... begin ... end // we should add this as new keyword
Better than: ... { ... }
end joke

This : used to begin an indentation region looks like an alias for { without a corresponding }. You can show this with a simple \{ or {{ at the end of the line. The : should be separated by a space (xyz : not xyz:) for better readability.

object IndentWidth {{
    private val spaces = IArray.tabulate(MaxCached + 1) {{
        new Run(' ', _)
end IndentWidth

Ichoran commented 5 years ago

@odersky - I don't understand what you mean--surely having a brace-free style is making a different dialect? (Just like xs take 2 is a different dialect than xs.take(2)?)

Anyway, we would only require : to open a brace-free block, but we'd always require it to open a brace-free block.

In 2.12/13, else does not begin a block, so it would continue to work as it does now. You would have to use else: only to get a block. (One could argue that else: vs else is too subtle to notice, but the compiler could also insist on non-confusing indentation following else so that you couldn't make it look like a block with else and still compile.)

krakel commented 5 years ago

I should describe my idea better. For normal brace-free style you do not need a block opener. In the rare cases where you exceptionally need a block, you have to use a block opener. In the brace style we use a single { with a block closer at the end. In the brace-free style, we need another block opener. The choice here is the : .

The first time I saw that, I was a bit confused. The other suggestions, like do or with, do not feel better either. That's why I wanted to propose a block opener for the brace-free style, which is similar to the normal style. My suggestion here is the double {{ . (I know { on a brace-free style, but we also use () )

Both the single { and the double {{ carry the same message, start a new block, but the latter with brace-free style. The double {{ does not collide with other constructs either. Only when a user starts a block in a block in brace style should the compiler issue a warning (use spaces between the braces) .

aesteve commented 5 years ago

For partial application, I guess some languages (Elm?) do use \

elements.foreach \elem =>
    doSomeStuffWith(elem)
    println(elem)

elements.foldLeft(0) \elem, agg =>
    elem.toInt + agg

I did get used to it, so I'm obviously biased, but I'm a bit curious whether it could be a potential candidate.

jxtps commented 5 years ago

Just to clarify for us regular users who are not compiler / language experts, the issue with #2 is the ambiguity as to precisely what part of the indented code is meant for each part of the multi-parenthesis function? I.e.

def foo(i:Int)(j:Int):Int = i+j

foo
  println("Hello")
  1
  println("World")
  2
  3

is completely ambiguous, whereas:

foo {
    println("Hello")
    1
  } {
    println("World")
    2
    3
  }

is fully specified.

If this is indeed the issue, how about adding another level of indentation for when you go to the next set of parentheses, like so:

foo
  println("Hello")
  1
    println("World")
    2
    3

Or did #2 refer to something else?

jxtps commented 5 years ago

For #1, definitely go with Martin's initial second suggestion, "Make indentation significant after class, object, etc headers without requiring a semicolon. This has the problem that it is not immediately clear whether we define an empty template or not. E.g. in

object X extends Z

 // ...
 // ...

 object Y

it is hard to see whether the second object is on the same level as the first or subordinate."

Well that's a funny rejection of indentation based syntax! ;)

I'd argue that there's no delimiter that will make the "is object Y subordinate or not" all that clear when there's vertical distance between X and Y.

In fact, I haven't seen any substantive argument against this option, so we might as well go with the lightest syntax possible.

odersky commented 5 years ago

To clarify what I mean when I say we do not want to create a dialect: It requires that code that is currently well-indented means exactly the same with optional braces and without.

For instance,

if x == 0 then {
  println(x)
  println(".")
}

is well-indented. The braces can be dropped without changing its meaning.
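
A minimal sketch of this equivalence, assuming a Scala 3 compiler with the indentation rules discussed in this thread. The names `OptionalBracesDemo`, `withBraces`, and `withoutBraces` are hypothetical.

```scala
object OptionalBracesDemo {
  // Well-indented code with explicit braces.
  def withBraces(x: Int): String =
    if x == 0 then {
      val s = "zero"
      s + "."
    } else {
      "nonzero"
    }

  // The same code with the braces dropped; indentation alone delimits the blocks.
  def withoutBraces(x: Int): String =
    if x == 0 then
      val s = "zero"
      s + "."
    else
      "nonzero"
}
```

Because the indentation already agrees with the braces, both methods are intended to behave identically for every input.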

jxtps commented 5 years ago

So to clarify, the concern is that

class C
  def foo = 1

could be interpreted as either:

class C
def foo = 1 // <- top level method, because these are allowed in dotty?

or

class C {
  def foo = 1
}

So the ambiguity arises from the possibility that a sloppily indented file written in the braces era would mean something different in the indentation era?

Based on "code that is currently well-indented means exactly the same with optional braces and without" my interpretation must be incorrect, so there is some other ambiguity that I don't see?

krakel commented 5 years ago

@jxtps

foo {
  println("Hello")
  1
} {
  println("World")
  2
  3
}

your code should be written in brace-less style:

foo
  println("Hello")
  1
:   // <- start a new brace-less block
  println("World")
  2
  3

jxtps commented 5 years ago

@krakel Right, and that extra : looks... terrible? Hence my suggestion for an additional level of indentation instead.

It feels reasonably natural in the world of indentation: a new level of indentation means a new block, which is the same as putting braces around that part of the code, which is what we were looking for here.

Ichoran commented 5 years ago

@jxtps - bippy{a}{b} is not the same as bippy{a{b}}, and the latter needs to be easily encoded, so your idea won't work.
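
The difference between the two shapes can be made visible with a small sketch in brace syntax. `Node` and `show` are hypothetical helpers introduced only to record how the applications were grouped; they are not part of any proposal here.

```scala
object BippyDemo {
  // Hypothetical tree type that records how applications were grouped.
  final case class Node(label: String, args: List[Node] = Nil) {
    def apply(arg: Node): Node = copy(args = args :+ arg)
    def show: String = label + args.map(n => "{" + n.show + "}").mkString
  }

  val bippy = Node("bippy")
  val a = Node("a")
  val b = Node("b")

  // Two argument blocks applied in sequence: bippy.apply(a).apply(b)
  val twoBlocks = bippy { a } { b }

  // One argument block containing a nested application: bippy.apply(a.apply(b))
  val nested = bippy { a { b } }
}
```

Any indentation scheme has to let the reader (and the parser) tell these two trees apart.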

jxtps commented 5 years ago

Excellent point @Ichoran, thank you! The syntax I suggested would be more suitable for the bippy{a{b}} case. For the bippy{a}{b} case we need to allow ending an indentation block only to start a new one. We already have an end token. Reuse it like so:

foo
  println("Hello")
  1
end
  println("World")
  2
  3

odersky commented 5 years ago

@jxtps Yes, it comes down to what we regard as properly indented. So my comment above was misleading. I have edited it now to drop the contentious part.

jxtps commented 5 years ago

"Braces optional" is an interesting way of thinking of the problem. What if we also think of it from the mirrored perspective: "indentation means braces", i.e. wherever there is (single level) indentation, imagine there are braces around it - no new keywords or delimiters (:, ..., do, where, ...). What happens then, and what would need to change to make that work?

Let's consider two rules:

  1. Single indentation means braces.
  2. Double+ indentation means continuation of the last line (like ending the previous line with \ in e.g. bash) - the intention here is to meaningfully line up the next line's code with a suitable column of the preceding line.

If we look at the meat of https://github.com/lampepfl/dotty/pull/7024#issuecomment-528773997 (yes I dropped the new then, do etc since they're superfluous for multi-line cases):

if  x < 0 && 
    y > 1  // <-- extra indent means continuation of last line = it's "inside the ()"
  -x 
else 
  x

while x >= 0 && 
      y > 1 // <-- extra indent means continuation of last line = it's "inside the ()"
  x = f(x)

for x <- xs 
    if x > 0 // <-- extra indent means continuation of last line = it's "inside the ()"
yield 
  x * x

for x <- xs
    y <- ys // <-- extra indent means continuation of last line = it's "inside the ()"
  println(x + y)

Since the "signature" of the control flow statements is roughly while(cond) { block } then we can see how this naturally translates to library code of similar signatures.

Something along these lines avoids excess keywords/tokens and is very "airy", but there's of course a concern with distinguishing between two levels of indentation: will users grok and realize the difference? And it probably violates the "no new dialect" doctrine.

We'd need to figure out at least how it would work in these additional cases (some of which have been covered above):

// General function invocations, with simple arguments f(1,2), blocks f({a},{b}) & blocks taking arguments f(x => ...): 
f(x, y)
f(x => a, y)
f(x => a, y => b)
f(x)(y)
f(x => a)(y)
f(x => a)(y => b)
f(g(x))
f(x => g(y))
f(x => g(y => b))

// When code overflows one line: 
val myReallyLongVariableName: MyReallyLongClassName = MyWayTooLongInitializer0 * 
  MyWayTooLongInitializer1 // How far do we indent? Min/max? What's clear?

// When chaining method calls (which is a version of code overflowing a line): 
xs
  .map(_.foo()) // How far do we indent? Min/max? What's clear?
  .flatMap(_.bar())
  ...

// When a method takes a lot of parameters, for clarity (also code overflowing a line)
xs.foo(
  a = 10, // How far do we indent? Min/max? What's clear?
  b = false, 
  ...
)

// There are also some code conversion concerns: 
val a = x * 
  y + z // With "old school indent" this risks being interpreted as x * { y + z } instead of x * y + z!

Maybe I'm off in space here (no pun intended, really), but on some level this entire indentation thing is too, so yeah, it would be great if we could get the benefit of braces optional without too much additional keyword / token baggage. And you can't get fewer tokens than 0 (not counting whitespace ;)

odersky commented 5 years ago

I have a hard time believing that "indentation means braces" would work without creating a dialect.

I propose the following procedure instead:

  1. Decide on rules to check that today's Scala code with braces is properly indented
  2. Implement these rules in the compiler or a format checker.
  3. Identify places where braces would be redundant for well-indented Scala programs, i.e. where indentation and braces always agree.
  4. Allow to drop the braces in these situations.

That way we are sure not to create a dialect: Some braces are simply optional for correct Scala programs, the same way some braces are optional now.
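
As a rough illustration of steps 1 and 2, a toy checker might look like the sketch below. It is not the rule set used by the compiler or by any existing formatter. `IndentCheck` and `wellIndented` are hypothetical names, and the sketch assumes space-only indentation while ignoring strings, comments, and continuation lines.

```scala
object IndentCheck {
  /** Toy check that brace nesting and indentation agree.
    * Assumes space-only indentation; ignores strings, comments and continuation lines.
    */
  def wellIndented(src: String): Boolean = {
    var openers = List.empty[Int] // indentation of each line holding a still-open '{'
    var ok = true
    for (line <- src.linesIterator if line.trim.nonEmpty) {
      val indent = line.takeWhile(_ == ' ').length
      val body = line.trim
      val rest =
        if (body.startsWith("}") && openers.nonEmpty) {
          // A leading '}' must line up with the line that opened the block.
          ok = ok && indent == openers.head
          openers = openers.tail
          body.drop(1)
        } else {
          // Any other line must be indented deeper than the enclosing opener.
          ok = ok && openers.headOption.forall(indent > _)
          body
        }
      // Track braces opened or closed later on the same line.
      for (c <- rest) c match {
        case '{' => openers = indent :: openers
        case '}' => openers = openers.drop(1)
        case _   => ()
      }
    }
    ok
  }
}
```

A call such as `IndentCheck.wellIndented("class C {\ndef foo = 1\n}")` would report the unindented member, which is the kind of program where braces could not simply be dropped.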

odersky commented 5 years ago

I have implemented in #7185 a version that does not require a : after a class or object signature. I believe it is overall an improvement. It does require we tighten rules for well-indented programs. In particular:

  1. All definitions inside a class or similar have to be indented relative to the start of the class.
  2. The definitions following a class without a body cannot be indented relative to the start of the class.

I have seen some reasonable code violating the second requirement. For instance:

trait A
  case class B() extends A
  case class C() extends A

This would have to be outlawed since otherwise the code above would be regarded as

trait A {
  case class B() extends A
  case class C() extends A
}

So we would have to make at least one controversial restriction on what is well-formed indentation to make optional braces behind class signatures work. Still, I think it is worth doing that.
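
The two readings are not interchangeable, as a small sketch in explicit brace syntax shows. The names `NestingDemo`, `A1`, `B1`, `A2`, and `B2` are hypothetical stand-ins for the A and B above.

```scala
object NestingDemo {
  // Reading 1: B1 is a sibling of A1, as the unbraced source means today.
  trait A1
  case class B1() extends A1

  // Reading 2: B2 is a member of A2, as indentation-based parsing would see it.
  trait A2 {
    case class B2() extends A2
  }

  val sibling: A1 = B1()      // constructed directly
  val outer: A2 = new A2 {}
  val member: A2 = outer.B2() // needs an A2 instance as a prefix
}
```

Under the second reading every B2 is tied to an enclosing instance, so silently switching interpretations would change how client code has to construct and refer to these classes.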

jxtps commented 5 years ago

If we have to have a token to tell when we go brace-less in a method invocation (which it seems like we have to), then there doesn't seem to be an ideal candidate, but : seems to be the least bad option, with the addition of allowing xs.foreach: x => to start <indent> like @lihaoyi suggested.

There's lots of weirdness in the syntax when you start enumerating all the combinations of functional forms and parameter styles. This list still needs to be worked through:

f(x, y)
f(x => a, y)
f(x, y => b)
f(x => a, y => b)
f(x)(y)
f(x => a)(y)
f(x)(y => b)
f(x => a)(y => b)
f(g(x))
f(x => g(y))
f(x => g(y => b))

Edit: Actually, shouldn't xs.foreach x => be a legal start of a braceless block on its own, as braces are allowed and subordinate code is required after it? So no need for the : there.

Still, where needed, the : seems to be the least bad option.

Ichoran commented 5 years ago

@odersky - Keep in mind that if a block is, in isolation, well-indented and braces are redundant, it does not follow that this remains true after braces are removed from neighbors. For instance,

foo{
  println("parameter block 1")
  p
} {
  println("parameter block 2")
  q
}

meets all the criteria for both blocks, as far as I can tell (modulo some choice about how } { should appear--one line or two, etc.). And each single removal works:

foo {
  println("parameter block 1")
  p
}
  println("parameter block 2")
  q

foo
  println("parameter block 1")
  p
{
  println("parameter block 2")
  q
}

But obviously both together is a problem:

foo
  println("parameter block 1")
  p
  println("parameter block 2")
  q

jxtps commented 5 years ago

@Ichoran I thought we'd have to have : to start a braceless block for most method invocations - otherwise you also have problems like this:

// a = (x * y) + z in scala2
val a = x *
   y + z

// but interpreted as a = x * (y + z) in scala3
val a = x * {
  y + z
}
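
With concrete numbers, the two groupings give different results. This is a minimal sketch in today's syntax; the object name and the values are illustrative only.

```scala
object PrecedenceDemo {
  val x = 2
  val y = 3
  val z = 4

  // A trailing operator continues the expression: parsed as (x * y) + z.
  val continued = x *
    y + z

  // Explicit braces group the second line: x * (y + z).
  val braced = x * {
    y + z
  }
}
```

Here `continued` and `braced` differ, so treating every indented continuation line as an implicit block would silently change the value of existing code.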

Still not clear on how to write the invocation you describe in braceless style - none of the forms I've seen jump out as being all that great:

foo: 
  println
  p
: // Variations: ":", "end", "end:" - all have issues. 
  println
  q

flavomulti commented 5 years ago

How do we handle fold when there are 2 parameter lists?

val value = Option(12)
val x = value.fold(10)(v => v+1)

I'm not sure what the approach is, I've not had anything work. For this trivial case I would just keep it on one line of course but I sometimes have 2-3 lines in the 2nd position.

val x = value.fold:
  10
???

Ichoran commented 5 years ago

Well, I think the conclusion is that only the last block can be brace-optional.

val sum = xs.foldLeft(0)
  (acc, x) => acc + x

and that explicit implicit parameters require braces (using a 2.12 example):

xs.map{ p =>
  val q = foo(p)
  baz(q, bar(q))
}(cbf)

aappddeevv commented 5 years ago

While I am not against it, it feels inconsistent. Perhaps there are some more advanced rules around : to enable optional braces in this scenario. Seems like this case should be covered somehow re: your examples above as well. I'm beginning to think that the "braces optional" concept feels like a slippery slope vs an all out removal approach. Having said that, in my converted-to-indents code base, I still use some braces for short expressions.

lihaoyi commented 5 years ago

Well, I think the conclusion is that only the last block can be brace-optional.

Using a keyword to delimit blocks allows multi-param-list brace-free function calls to look OK in my opinion:

val sum = xs.foldLeft fn
  0
fn (acc, x) => 
  acc + x

xs.map fn p =>
  val q = foo(p)
  baz(q, bar(q))
fn
  cbf

val sum = xs.foldLeft do
  0
do (acc, x) => 
  acc + x

xs.map do p =>
  val q = foo(p)
  baz(q, bar(q))
do
  cbf

This would be a trivial desugaring to {}s:

val sum = xs.foldLeft {
  0
} { (acc, x) => 
  acc + x
}

xs.map { p =>
  val q = foo(p)
  baz(q, bar(q))
} {
  cbf
}

Using : only is problematic because : alone on its own line looks weird:

val sum = xs.foldLeft :
  0
: (acc, x) => 
  acc + x

xs.map : p =>
  val q = foo(p)
  baz(q, bar(q))
:
  cbf

Making these multi-arg-list function calls pretty is one of the reasons I think using a two-character keyword like do or fn is superior to using an operator like :, even though neither do's nor fn's English meaning is 100% on target.

aappddeevv commented 5 years ago

I could get used to the last sum with : I think. But not every second parameter list is a function, and it would look awkward if it is just a short expression, in which case why use indent syntax? I’m beginning to think that this syntax is good at getting rid of obviously redundant braces and braces for blocks with more than one statement.

Ichoran commented 5 years ago

@lihaoyi - I was trying to adhere to Martin's new "strict elision" policy--that is they are optional braces with no extra syntax, to make it as little of a language split as possible.

If the feature is merely optional braces, nothing else, then only the last block could work since the syntax is otherwise ambiguous.
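
In brace syntax, the "only the last block" shape looks like the sketch below: earlier parameter lists stay in parentheses and only the final `{...}` would be a candidate for elision. The object name and values are illustrative, loosely based on the fold examples earlier in the thread.

```scala
object LastBlockDemo {
  val value = Option(12)

  // First parameter list stays in parentheses; only the final block is multi-line.
  val x = value.fold(10) { v =>
    val next = v + 1
    next * 2
  }

  val xs = List(1, 2, 3)

  // Same shape for foldLeft: initial value in parentheses, function in the last block.
  val sum = xs.foldLeft(0) { (acc, n) =>
    acc + n
  }
}
```

Writing the first argument as an indented block as well is exactly the case that needs some extra separator between the two blocks.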

aappddeevv commented 5 years ago

@odersky I've been using the new syntax since it was released. It was easy to pick up and requires almost no thinking on my part to switch, which is great. The rules for how to use it are also easy to understand.

So far, the multiple parameter list issue causes the most "didn't expect that" moments.