scala / scala3

The Scala 3 compiler, also known as Dotty.
https://dotty.epfl.ch
Apache License 2.0
5.89k stars 1.06k forks

Consider syntax with significant indentation #2491

Closed odersky closed 7 years ago

odersky commented 7 years ago

I was playing for a while now with ways to make Scala's syntax indentation-based. I always admired the neatness of Python syntax and also found that F# has benefited greatly from its optional indentation-based syntax, so much so that nobody seems to use the original syntax anymore. I had some good conversations with @lihaoyi at Scala Exchange in 2015 about this. At the time, there were some issues with which I was not happy yet, notably how to elide braces of arguments to user-defined functions. I now have a proposal that addresses these issues.

Proposal in a Nutshell

Motivation

Why use indentation-based syntax?

This solves the alignment issue: The if and the elses are now vertically aligned. But it gives up even more control over vertical whitespace.

Impediments

What are the reasons for preferring braces over indentations?

But neither of these points are the strongest argument against indentation. The strongest argument is clearly

Proposal in Detail

Expanded use of with

While we are about to phase out with as a connective for types, we propose to add it in two new roles for definitions and terms. For definitions, we allow with as an optional prefix of (so far brace-enclosed) statement sequences in templates, packages, and enums. For terms, we allow with as another way to express function application. f with { e } is the same as f{e}. This second rule looks redundant at first, but will become important once significant indentation is added. The proposed syntax changes are as described in this diff.

Significant Indentation

In code outside of braces, parentheses or brackets we maintain a stack of indentation levels. At the start of the program, the stack consists of the indentation level zero.

If a line ends in one of the keywords =, if, then, else, match, for, yield, while, do, try, catch, finally or with, and the next token starts in a column greater than the topmost indentation level of the stack, an open brace { is implicitly inserted and the starting column of the token is pushed as new top entry on the stack.

If a line starts in a column smaller than the current topmost indentation level, it is checked that there is an entry in the stack whose indentation level precisely matches the start column. The stack is popped until that entry is at the top and for each popped entry a closing brace } is implicitly inserted. If there is no entry in the stack whose indentation level precisely matches the start column an error is issued.

None of these steps is taken in code that is enclosed in braces, parentheses or brackets.
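The stack rules above can be sketched in a few lines of Scala. This is an illustration only, not the implementation from #2488: the names `IndentSketch`, `openers`, and `insertBraces` are hypothetical, tokens are approximated by splitting on whitespace, blank lines are dropped, and code already enclosed in braces, parentheses or brackets is not handled.

```scala
import scala.annotation.tailrec

// Sketch only: hypothetical names, whitespace-based "tokens", no bracket tracking.
object IndentSketch {
  // Tokens that may open an indented block when they end a line (taken from the rule above).
  val openers: Set[String] = Set("=", "if", "then", "else", "match", "for", "yield",
    "while", "do", "try", "catch", "finally", "with")

  private def indentOf(line: String): Int = line.takeWhile(_ == ' ').length

  private def lastToken(line: String): String =
    line.trim.split("\\s+").lastOption.getOrElse("")

  // One closing brace per stack entry deeper than `col`, placed at the parent level's column.
  private def closers(stack: List[Int], col: Int): Vector[String] =
    stack.takeWhile(_ > col).zip(stack.drop(1)).map { case (_, parent) => " " * parent + "}" }.toVector

  /** Makes the implicit braces explicit, or reports a misaligned line. */
  def insertBraces(lines: List[String]): Either[String, List[String]] = {
    @tailrec
    def loop(rest: List[String], stack: List[Int], prevOpens: Boolean,
             out: Vector[String]): Either[String, List[String]] = rest match {
      case Nil =>
        // End of input closes every block that is still open.
        Right((out ++ closers(stack, 0)).toList)
      case line :: tail if line.trim.isEmpty =>
        loop(tail, stack, prevOpens, out) // blank lines are dropped in this sketch
      case line :: tail =>
        val col = indentOf(line)
        val opens = openers.contains(lastToken(line))
        if (prevOpens && col > stack.head)
          // Previous line ended in an opener and this line is indented further: push a level.
          loop(tail, col :: stack, opens, out.init :+ (out.last + " {") :+ line)
        else if (col < stack.head && !stack.contains(col))
          Left(s"line '${line.trim}' matches no enclosing indentation level")
        else
          // Same level, a continuation line, or a dedent that pops to a matching level.
          loop(tail, stack.dropWhile(_ > col), opens, (out ++ closers(stack, col)) :+ line)
    }
    loop(lines, List(0), prevOpens = false, Vector.empty)
  }
}
```

For instance, `IndentSketch.insertBraces(List("def f =", "  val x = 1", "  x + 1"))` wraps the two indented lines in a brace pair, while a line that dedents to a column not on the stack yields a `Left` with an error message.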

Lambdas with with

A special convention allows the common layout of lambda arguments without braces, as in:

xs.map with x =>
  ...

The rule is as follows: If a line contains an occurrence of the with keyword, and that same line ends in a => and is followed by an indented block, and neither the with nor the => is enclosed by braces, parentheses or brackets, an open brace { is assumed directly following the with and a matching closing brace is assumed at the end of the indented block.

If there are several occurrences of with on the same line that match the condition above, the last one is chosen as the start of the indented block.
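As a rough illustration of the line-level part of this rule, the following sketch finds the last unenclosed with on a line that ends in an unenclosed => and makes the assumed opening brace explicit. The names `LambdaWithSketch` and `openLambdaBlock` are hypothetical, string literals and comments are not recognized, and the check that an indented block actually follows is left out.

```scala
// Sketch only: hypothetical helper, not the lexer lookahead used in the actual implementation.
object LambdaWithSketch {
  private val WithKeyword = "\\bwith\\b".r

  /** Returns the line with `{` inserted after the last unenclosed `with`,
    * provided the line ends in an unenclosed `=>`; otherwise None. */
  def openLambdaBlock(line: String): Option[String] = {
    val text = line.replaceAll("\\s+$", "")
    // depth(i) is the bracket nesting depth just before character i.
    val depth = text.toVector.scanLeft(0) {
      case (d, '(' | '[' | '{') => d + 1
      case (d, ')' | ']' | '}') => d - 1
      case (d, _)               => d
    }
    val endsInArrow = text.endsWith("=>") && depth(text.length - 2) == 0
    if (!endsInArrow) None
    else
      WithKeyword.findAllMatchIn(text)
        .filter(m => depth(m.start) == 0) // ignore `with` inside (), [] or {}
        .toList
        .lastOption                       // several candidates: the last one wins
        .map(m => text.substring(0, m.end) + " {" + text.substring(m.end))
  }
}
```

Under these assumptions, `LambdaWithSketch.openLambdaBlock("xs.map with x =>")` gives `Some("xs.map with { x =>")`; the matching closing brace would then come from the indentation stack when the indented block ends.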

Interpreted End-Comments

If a statement follows a long indented code block, it is sometimes difficult as a writer to ensure that the statement is correctly indented, or as a reader to find out to what indentation level the new statement belongs. Braces help because they show that something ends here, even though they do not say by themselves what. We can improve code understanding by adding comments when a long definition ends, as in the following code:

    def f =
       def g =
          ...
          (long code sequence)
          ...
    // end f

    def h

The proposal is to make comments like this one more useful by checking that the indentation of the // end comment matches the indentation of the structure it refers to. In case of discrepancy, the compiler should issue a warning like:

// end f
~~~~~~
misaligned // end, corresponds to nothing

More precisely, let an "end-comment" be a line comment of the form

// end <id>

where <id> is a consecutive sequence of identifier and/or operator characters and <id> either ends the comment or is followed by a punctuation character ., ;, or ,. If <id> is one of the strings def, val, type, class, object, enum, package, if, match, try, while, do, or for, the compiler checks that the comment is immediately preceded by a syntactic construct described by a keyword matching <id> and starting in the same column as the end comment. If <id> is an identifier or operator name, the compiler checks that the comment is immediately preceded by a definition of that identifier or operator that starts in the same column as the end comment. If a check fails, a warning is issued.
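A minimal sketch of how such a comment might be recognized and classified is shown here. The names `EndCommentSketch`, `EndsKeyword`, `EndsDefinition`, and `parse` are hypothetical, the character class used for <id> only approximates "identifier and/or operator characters", and the actual check against the preceding construct (same keyword or name, same column) is not implemented.

```scala
// Sketch only: classifies an end-comment; the compiler-side column check is left out.
object EndCommentSketch {
  sealed trait EndMarker
  final case class EndsKeyword(keyword: String, column: Int) extends EndMarker
  final case class EndsDefinition(name: String, column: Int) extends EndMarker

  // Keywords listed in the rule above.
  val keywords: Set[String] = Set("def", "val", "type", "class", "object", "enum",
    "package", "if", "match", "try", "while", "do", "for")

  // <id> is approximated as a run of characters other than whitespace and . ; ,
  // It must end the comment or be followed by one of those punctuation characters.
  private val EndComment = """^(\s*)// end ([^\s.;,]+)\s*([.;,].*)?$""".r

  def parse(line: String): Option[EndMarker] = line match {
    case EndComment(indent, id, _) =>
      val column = indent.length
      Some(if (keywords(id)) EndsKeyword(id, column) else EndsDefinition(id, column))
    case _ => None
  }
}
```

With this sketch, `EndCommentSketch.parse("  // end while")` yields `Some(EndsKeyword("while", 2))`, and an ordinary comment such as `// end of story` is not treated as an end-comment because `of` is followed by more text rather than punctuation.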

Implementation

The proposal has been implemented in #2488. The implementation is quite similar to the way optional semicolons are supported. The bulk of the implementation can be done in the lexical analyzer, looking only at the current token and line indentation. The rule for "lambdas with with" requires some lookahead in the lexical analyzer to check the status at the end of the current line. The parser needs to be modified in a straightforward way to support the new syntax with the generalized use of with.

Example

Here's some example code, which has been compiled with the implementation in #2488.

object Test with

  val xs = List(1, 2, 3)

// Plain indentation

  xs.map with
       x => x + 2
    .filter with
       x => x % 2 == 0
    .foldLeft(0) with
       _ + _

// Using lambdas with `with`

  xs.map with x =>
      x + 2
    .filter with x =>
      x % 2 == 0
    .foldLeft(0) with
      _ + _

// for expressions

  for
    x <- List(1, 2, 3)
    y <- List(x + 1)
  yield
    x + y

  for
    x <- List(1, 2, 3)
    y <- List(x + 1)
  do
    println(x + y)

// Try expressions

  try
    val x = 3
    1.0 / x
  catch
    case ex: Exception =>
      0
  finally
    println("done")

// Match expressions

  xs match
    case Nil =>
      println()
      0
    case x :: Nil =>
      1
    case _ => 2

// While and Do

  do
    println("x")
    println("y")
  while
    println("z")
    true

  while
    println("z")
    true
  do
    println("x")
    println("y")

  // end while

// end Test

package p with

  object o with

    class C extends Object
               with Serializable with

      val x = new C with
          def y = 3

      val result =
        if x == x then
          println("yes")
          true
        else
          println("no")
          false

    // end C
  // end o

smarter commented 7 years ago

Audacious :). A few things that come to mind:

smarter commented 7 years ago

It would also be interesting to compare this proposal with the existing https://github.com/lihaoyi/Scalite

odersky commented 7 years ago

how do you interpret the following code?

new X with
  Y

As new X with {Y}. So you should not write code like that. I believe it's really bad code formatting anyway, you should have written

new X
  with Y

scalafix will be able to help, I am sure.

smarter commented 7 years ago

Taking a step back, what is the with for anyway? What prevents us from being able to write:

xs.map x =>
  x + 2

xs.collect
  case P1 => E1
  case P2 => E2

lihaoyi commented 7 years ago

Woo!

Some thoughts:

Providing visual cues

To be honest, I think that two-space indents are the main culprit for making the given example hard to read:

def f =
  def g =
    def h =
      def i = 1
      i
  def j = 2

Using 3 or 4-space indents, it's much clearer in most fonts:

def f =
    def g =
        def h =
            def i = 1
            i
    def j = 2

That is the reason all my Scalite examples use 3-space or 4-space indents. Two-space indents work if you have curlies, but in most fonts they are confusing for indentation-delimited blocks.

I've also experienced working on a large Coffeescript codebase (also indentation-delimited) with 2-space indents, and basic block scoping was definitely confusing and easy to misread compared to our 4-space-indented Python codebase.

W.r.t. editor support, many editors already support indentation-highlighting to some extent, e.g. IntelliJ's {Ctrl,Cmd}-W "Extend selection" and Sublime Text's Shift-Cmd-J "Expand Selection to Indentation". Going to the start/end of a block is then a matter of highlighting it and then pressing left/right, and presumably it wouldn't be hard to add a shortcut if we wanted it to be a single command.

I think with is too verbose a keyword to use for this purpose

Consider the difference between OCaml style

let foo = bar in 
let baz = qux

vs Java-style

int foo = bar; 
int baz = qux

It's only one character, but I think it makes a huge difference. Python uses : which is close to ideal; in Scala we can't because : is reserved for type ascriptions, but I think it's worth looking for something lighter weight than with

I think functions and classes have a nice symmetry we shouldn't break

def foo(i: Int, s: String = "") = {
  def bar() = s * i
  println(bar())
}

class Foo(i: Int, s: String = ""){
  def bar() = s * i
  println(bar())
}

foo(3, "lol")
new Foo(3, "lol")

Both take zero-or-more argument lists, both take a block of statements, both run every statement in the block. The main "semantic" difference is that functions return the last statement, whereas classes return the entire scope (ignoring "invisible" differences like bytecode representation, initialization-order limits, etc.)

Given the symmetry, if we're going to allow multiline functions with =

def foo(i: Int, s: String) =
  def bar() = s * i
  println(bar())

I think it makes sense to allow multi-line classes with the same syntax

class Foo(i: Int, s: String) =
  def bar() = s * i
  println(bar())

Notably, this is the syntax that F# has chosen https://fsharpforfunandprofit.com/posts/classes/

Given that multi-line while conditionals and for-generators are allowed, what about if-conditionals

Will this work?

if                                                          if ({
    println("checking...")                                    println("checking...")
    var j = i + 1                                             var j = i + 1
    j < 10                                                    j < 10
do                                                          }) {
    println("small")                                          println("small")
    1                                                         1
else                                                        } else {
    println("big")                                            println("big")
    100                                                       100

I think it should, for symmetry

I think that we should be able to leave out with in lambdas

xs.map x =>
  x + 2

xs.collect
  case P1 => E1
  case P2 => E2

As @smarter mentioned. It might take some hacks (since we don't know it's a lambda until we've already fully tokenized the argument list, and we'll need to go back in time to insert the synthetic curly). Not sure if there's a technical reason it can't be done, but syntactically I think it's unambiguous to a reader, since lambda-argument-lists don't tend to be that long.

odersky commented 7 years ago

@lihaoyi Thanks for your long and constructive comment. Some reactions:

To be honest, I think that two-space indents is the main culprit for making the given example hard to read.

You might well be right. The proposal is silent about how many spaces are to be used. It's a separate discussion, but one which is entangled with the current one.

I am against using = for starting a class body because it's semantically wrong. A class is certainly not the same as its body of declarations. Instead, a class definition introduces a new entity which comes with some declarations.

Multi-line if condition: sure, let's add it.

Leave out with in lambdas: What would the grammar be for this? One of the advantages of the current proposal is that it does not need parser hacks. A smarter lexical analyzer is all that's needed. Also, I think it's important to guard against accidental off-by-one indentations turning into significant nesting.

RichyHBM commented 7 years ago

Just to make sure, IF implemented, would this be an optional flag/command passed to the compiler to indicate this is an indentation file, or would it be more like python where you are forced to use indentation?

Whilst python's indentation based syntax is cool in theory, I have found it to be rather annoying in practice. If you use various different text editors/IDEs that may treat spaces/tabs differently it means you can break your code just by opening it up in a new editor.

You might well be right. The proposal is silent about how many spaces are to be used. It's a separate discussion, but one which is entangled with the current one.

I know this isn't the discussion for the amount of spaces, and I don't intend it to be that. But it is a very real issue that many people are passionate about and I feel it would cause lots of issues. The possible alternative would be to just use tabs and allow the users to set them to the amount they prefer, but then you are probably looking at a space vs tabs discussion..

And just to add to this, Scala is a language that is already criticized as complicated for newcomers, and I feel adding another restriction would just add to this. Currently you can write your code as you like, but having weird code issues because you did not place indentation in the right place is just another hurdle.

Jasper-M commented 7 years ago

I have to agree the indentation based syntax is pretty nice on the eyes in a trivial example. But keeping in mind the cost of change and the fact that indentation based is not strictly better than curly braces based, I think it's probably better not to add it.

Sciss commented 7 years ago

An exciting proposal. I can follow most arguments. The two things that I don't like are 'lambdas with with' and 'interpreted end-comments'. The first one frankly just looks awkward, so I would be with @lihaoyi here to not use the with keyword in this case. Also, the number of alternative ways of writing a map just explodes:

xs.map(x => ...)
xs.map { x => ... }
xs map { x => ... }
xs.map with x => ...
xs map with x => ... // ?

When I look through the examples, the lambda ones are the ones that stand out as outlandish to me.

The interpreted-end-comments I find very problematic. You are instructing the compiler now to make sense of line comments. I also think this will make life miserable for parser, editor and IDE authors. People can still use curly braces, no? Then we can stick to braces if the block has many lines that make it difficult to see the end. As I understand, there will be already at least a period where both curly braces and indentation are allowed. My guess is, people will use indentation for short (< 10 lines) blocks, and keep braces for longer blocks. Perhaps this mix is actually a good solution in terms of readability. If not, why not use a real keyword end, such as

def f =
   def g =
      ...
      (long code sequence)
      ...
end f

def h

?

As several people have said, using indentation almost certainly means we will have to use at least three spaces to come out clearly and visibly. Perhaps this setting could be a scalacOption with an optional (linter) switch to emit warnings if the code uses another indentation size.

Overall, I think this is great and I hope it will be implemented. People (including myself, I think) have argued against this when Python came along, but people have also gotten used to it, and I think the success of Python and the cleanness of Ocaml/F# clearly speak in favour of reducing the visual clutter.

I'm looking forward to having if then constructs and the possibility to have multi-line conditions in if and while without the awkward ({ ... }).

lihaoyi commented 7 years ago

Three more notes:

I agree that // end comments look terrible

If we want them to be a part of the syntax, we should introduce syntax for them. Or we could let people use curly-brackets optionally.

I think both cases aren't necessary; in Python people get by just fine without end delimiters, but anything is better than // end comments!

How optional will this syntax be?

Will it be a per-project setting? Per-file? Perhaps even more fine-grained, where you can mix in curlies and indentation-based syntax?

Can we get rid of do-while loops?

This

do
    println("x")
    println("y")
while
    println("z")
    true

is isomorphic to the while-loop

while
    println("x")
    println("y")
    println("z")
    true
do
    ()

In fact, the while-loop version is superior, because the while condition in a do-while does not have access to the things defined in the do block, whereas in the while-do-() case the conditional has everything defined in the while block in scope. This has proven extremely useful to me on many occasions, and given that this new syntax (potentially) makes while-do-() as nice to use as do-while, I think we can really just drop the do-while syntax case entirely.
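The scoping point can already be seen with today's brace syntax, where a block can serve as the loop condition. The sketch below is only an illustration with hypothetical names (`WhileBlockSketch`, `readUntilBlank`); the equivalent do-while would not compile, because a value defined in the do body is not visible in the trailing condition.

```scala
import scala.collection.mutable.ListBuffer

// Sketch only: hypothetical example of the "while { body; condition } ()" pattern.
object WhileBlockSketch {
  /** Collects lines until the first empty line or the end of input. */
  def readUntilBlank(input: Iterator[String]): List[String] = {
    val acc = ListBuffer.empty[String]
    // The whole block is the condition, so `line` is in scope for the final test.
    while ({
      val line = if (input.hasNext) input.next() else ""
      if (line.nonEmpty) acc += line
      line.nonEmpty
    }) ()
    acc.toList
  }
}
```

Calling `WhileBlockSketch.readUntilBlank(Iterator("a", "b", "", "c"))` stops at the empty string; the loop body proper is just `()`, which is the shape the proposed while ... do () syntax would express without the `({ ... })` wrapper.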

Sciss commented 7 years ago

while
   println("x")
   println("y")
   println("z")
   true
do
   ()

Sorry, but that is horrible. That reminds me of C code where everything is written in the for statement, like for (do some stuff) ;;

lihaoyi commented 7 years ago

We could let the syntax leave out the do

while
   println("x")
   println("y")
   println("z")
   true

Not so bad? At least I think it's not too bad...

bmjsmith commented 7 years ago

This offers no objective improvement to the language at a cost that is not insignificant. More overloaded keywords is the last thing that helps newbies and supporting two styles or switching between them is burdensome. Developers in general will not reach a consensus on indentation vs delimiters any more than they will on tabs vs spaces or which line your curly brackets go on. Please don't facilitate wasting effort debating this (or having to switch) in every project and leave it as it is.

Jasper-M commented 7 years ago

Also, what about the "scalable language" thing, where you can implement your own language constructs?

def until(pred: =>Boolean)(body: =>Unit) = while (!pred) body

And then

var i = 0
until (i == 10) {
  i += 1
  println(i)
}

Does that work as seamlessly with indentation as well?

Edit: to answer my own question: not in the current proposal. You would have

var i = 0
until (i == 10) with
  i += 1
  println(i)

optician commented 7 years ago

About motivation:

Cleaner typography

With the whitespace approach you'll definitely use at least a 4-space indent, which cuts into the available horizontal space. Line breaks would appear more often.

Regain control of vertical white space.

Could you please provide an example?

Ease of learning. There are some powerful arguments why indentation based syntax is easier to learn.

Opinion based. My experience with novices, and all the stories I have heard, show no evidence of such a problem.

Less prone to errors.

An indentation error is a new kind of syntax error for the language, so it is difficult to say that it is less prone to errors. Working in the REPL would also be harder.

Easier to change.

Agree. But imho one case is not a big deal.

Better vertical alignment.

Agree. I use the same style. So problem is absent.

shawjef3 commented 7 years ago

There is another impediment: Replacing braces with whitespace breaks the principle of don't-repeat-yourself, which I believe most code authors believe in. Braces are a factoring on lines of code to say they are in a particular scope. Removing braces means that each line defines what scope it is in.

reverofevil commented 7 years ago

Great news.

If certain keywords are followed

Could it be simplified to a much simpler rule: enclose every extra level of tabulation with a pair of curly braces? I've implemented significant whitespace this way in my own similar language, and had no issues with context at all.

Susceptibility to off-by-one indentation

That's easily solved by also enforcing tabs at the beginning of the line. As a side effect, there would be no more need to have an exact count of spaces per tab in style guides.

odersky commented 7 years ago

Wow, this proposal has generated a lot of heat (should have expected that!) I think for now my proposed strategy will be:

Once the experiments are in, decide on whether we want to keep this.

ivan-klass commented 7 years ago

I like the idea of indent syntax. However, in a big project it can be hard to change the whole codebase at once, so I think this compiler option should be controlled per-file. We could also think of a more powerful syntax migration mechanism or convention, for example something like "from __future__ import ..." in Python. As for "with" for lambdas, it personally looks kinda awkward to me, but acceptable.

Blaisorblade commented 7 years ago

👍 on some context-sensitive syntax. It might also be a good occasion to try revisiting other issues with Scala syntax...

[...] That rule proves to be quite constraining (for instance it would outlaw the chained filter and map operations in the example below), so it is currently not implemented.

I'm afraid of too much flexibility, based on my experience with layout-sensitive syntax in Haskell.

I can't fully pinpoint the problem there, and I'm sorry this isn't too constructive yet, but I know that indenting Haskell code, and offering editor support for it, is a rather nontrivial matter (I've used some Emacs indentation modes, and I've never been fully happy with any of those—see http://haskell.github.io/haskell-mode/manual/13.16/Indentation.html for a list).

The topic is so tricky that one of these indentation modes deserved a 10-page JFP paper (which I haven't studied), "Dynamic tabbing for automatic indentation with the layout rule", http://www.cs.tufts.edu/~nr/cs257/archive/guy-lapalme/layout.pdf.

TL;DR. I'm happy if we don't drown in flexibility. I could be fine with following a style guide—as long as the rules allow for satisfactory styles.

EDIT: please tag me if you want to reply to me and have me read your answer—gotta unsubscribe from this busy issue.

EncodePanda commented 7 years ago

OH "While we at it, can we add dynamic typing as well?" :)

ghost commented 7 years ago

In this same mindset, we should go one step further and remove lane delimiters from streets. Also stop, yield signs, traffic lights, and create implicit rules that people can follow in their heads. That would make streets look more elegant.

EncodePanda commented 7 years ago

That was a joke, lol.

obask commented 7 years ago

I feel like forcing number of indents is a terrible idea: Haskell and LISP use num of spaces equal to previous line keyword, like: (do-some-stuff people ___user) Instead of: (do-some-stuff people __user)

jpallas commented 7 years ago

I think @lihaoyi's point above about with and : cannot be overemphasized. Python's syntax works in large part because : (at end of line) reliably signals a new block and does so in a visually distinctive way that a keyword cannot. But Python doesn't have the challenge of blocks within an expression. There might simply be no good way to do this for Scala.

lihaoyi commented 7 years ago

Wow, this proposal has generated a lot of heat (should have expected that!)

Surely this is expected. The response is actually milder than I'd expect; imagine submitting a curly-brace PEP to the Python community!

jpallas commented 7 years ago

imagine submitting a curly-brace PEP to the Python community

>>> from __future__ import braces
  File "<stdin>", line 1
SyntaxError: not a chance

kmizu commented 7 years ago

Certainly, the new syntax proposal is great. At the same time, the current style is not so problematic in the real Scala world. In my opinion, such a big change should not be introduced to solve a small problem.

pkolaczk commented 7 years ago

Generally I like this proposal. Braces offer actually too much flexibility on how things can be formatted and make a lot of visual noise.

Indent based nesting is more natural, because it is also the way we structure content in written natural language.

What I don't like about this proposal:

Also +1 for an automated formatter.

lihaoyi commented 7 years ago

Thinking about this a bit more, I think if we're going to have to overload a keyword to annotate significant indentation, we should overload : rather than with. Thus, any trailing : on a line will (backwards-incompatibly) be taken to mean the opening of a new indentation-delimited block, rather than a type ascription.

: is something that many people will already be familiar with from Python. It is concise, distinct (not just another English keyword in a sea of English keywords), and relatively neutral with regard to what it "means": it just means a new block scope, without any English connotations to worry about.

Furthermore, I think lines with trailing withs are probably more common in existing Scala code than lines with trailing :, though it's an empirical question whether that's true throughout the community.

It's not great to overload keywords, but if we're going to be overloading something anyway I think overloading : makes a lot more sense than overloading with

package p:

  object o:

    class C extends Object
               with Serializable:

      val x = new C:
          def y = 3

      val result =
        if (x == x):
          println("yes")
          true
        else
          println("no")
          false
  for
    x <- List(1, 2, 3)
    y <- List(x + 1)
  yield
    x + y

  for
    x <- List(1, 2, 3)
    y <- List(x + 1)
  do
    println(x + y)

// Try expressions

  try
    val x = 3
    1.0 / x
  catch
    case ex: Exception =>
      0
  finally
    println("done")

// Match expressions

@odersky what do you think?

EDIT: I still think it would be best to be able to have no delimiter at all after the last token of a class/object/package header. Whether it's technically easy to do or not I don't really know, but I think it's definitely worth discussing on a "would that be a nice syntax to have" even if it turns out to be hard to implement

package p

  object o

    class C extends Object
               with Serializable

      val x = new C
          def y = 3

      val result =
        if (x == x):
          println("yes")
          true
        else
          println("no")
          false

pkolaczk commented 7 years ago

Will I be able to write short expressions in a single line?

This looks weird: val x = if y: 0 else 1

Honestly, I prefer: val x = if y then 0 else 1

A nice thing about keywords like if .. then .. else or while .. do is that they allow dropping parens, even if the expression is formatted in a single line.

pkolaczk commented 7 years ago

Actually:

val x = if y then 0 else 1

looks to me even better than the current:

val x = if (y) 0 else 1

Here we have a mixture of syntax based on punctuation (parens) and keyword (else).

Sciss commented 7 years ago

After looking at Martin's original proposal again, and comparing with the colon proposal by Lihaoyi, I think it makes more sense to stick to the original idea. It looks good for all cases where with is not required: for, while, if then, try, match, def name =. So the problem IMO is only the lambdas and the class definitions.

pkolaczk commented 7 years ago

Is there really any good reason we need with in the original proposal?

Why is this a problem:

class C1
class C2
  def inc(x: Int): Int =   // this is indented, so the compiler would know we continue C2 
    x + 1
end C2 // optional

jvvasques commented 7 years ago

This is a really interesting topic and I'm very happy to see it debated with so many constructive arguments. I wanted to give my opinion as well.

I believe that removing { makes code more elegant and pleasant to read. However, I'm not the biggest fan of the python indentation approach. One of the few things I like about Ruby is the elegant and concise syntax. I think the Ruby guys and now Elixir by the hands of Jose Valim made some really interesting choices there.

May I suggest something like this.

Blocks

def f(x: Int) =
  val y = x * x
  y + 1
end

match expressions:

xs match do
  case x :: xs1 => ...
  case Nil => ...
end

delimit statement sequences in objects and classes. Example:

object Obj
  class C(x: Int)
    def f = x + 3
  end

  def apply(x: Int) = new C(x)
end

Passing arguments that were formerly in braces to functions. Examples:

xs.map do x =>
  x + 2
end

xs.collect do
  case P1 => E1
  case P2 => E2
end

Would love to get some feedback on this 🙏

danielyli commented 7 years ago

To assess the risks and benefits of an indentation-based language, it's helpful to look to the accumulated knowledge from real-world experiences of using them in a large organization. Probably very few companies have as large a Python codebase as Google.

The creators of Go have this to say about why their experience with Python at Google informed them that indentation-based code blocks were not as safe or dependable as C-style braces:

Some observers objected to Go's C-like block structure with braces, preferring the use of spaces for indentation, in the style of Python or Haskell. However, we have had extensive experience tracking down build and test failures caused by cross-language builds where a Python snippet embedded in another language, for instance through a SWIG invocation, is subtly and invisibly broken by a change in the indentation of the surrounding code. Our position is therefore that, although spaces for indentation is nice for small programs, it doesn't scale well, and the bigger and more heterogeneous the code base, the more trouble it can cause. It is better to forgo convenience for safety and dependability, so Go has brace-bounded blocks. -- Rob Pike, Google

(Source: https://talks.golang.org/2012/splash.article)

vanDonselaar commented 7 years ago

I appreciate the open attitude towards change and improvement of the language. However, I think significant whitespacing is a mistake. A few practical things that I've experienced while using the language that inspired this proposal (Python).

Apart from the practical problems, I still don't see what problem is solved here. I'm afraid it's only a matter of aesthetics.

olafurpg commented 7 years ago

I made an experiment to see how this change would impact code in the wild.

I implemented a naive scalafix rewrite to remove all curly braces as long as the open/close pair is not on the same line: https://github.com/scalacenter/scalafix/commit/a921c42d79f134011b3bb8ac62437d4d6c4dfee2 This rewrite does not handle the with syntax or validate that the indentation inside blocks/templates is consistently 2 spaces. This rewrite is only meant to give a rough feel for what the diff looks like.

I ran the rewrite on ~30k source files (~3 million lines of code) from the projects: PredictionIO akka breeze cats ensime-server fastparse finagle framework gitbucket goose intellij-scala kafka lila marathon pickling platform playframework saddle sbt scala scala-js scalafx scalafx-ensemble scalatra scalaz scalding scaloid shapeless slick spark spire summingbird util.

The diff from running the rewrite is here: https://github.com/olafurpg/scala-repos/pull/21

Some observations:

(foo
  andThen bar
  andThen baz)
// term application with block argument
(foo {
  println(1)
  qux
})
// without curly, is this supposed to work or not?
(foo
  println (1)
  qux
)

Jasper-M commented 7 years ago

Deeply nested code becomes hard to follow in my opinion, 4 space indent may help.

I have seen code in the scala compiler where 4 space indents would mean that some pieces of code are indented with up to 52 spaces.

dwijnand commented 7 years ago

Deeply nested code becomes hard to follow in my opinion, 4 space indent may help. Example Spray routes https://github.com/olafurpg/scala-repos/blob/1225fa10eb67934d7c15fec745ceefe43005533b/repos/PredictionIO/data/src/main/scala/io/prediction/data/api/EventServer.scala#L334-L385

In my opinion that deeply nested code is equally hard to read even when it had brackets (and a slightly longer line length):

https://github.com/apache/incubator-predictionio/blob/0fa51c29104f9f776ec26ee2d9a211cb304a1dd3/data/src/main/scala/org/apache/predictionio/data/api/EventServer.scala#L380-L442

lihaoyi commented 7 years ago

without curly braces, how do we distinguish between a term application with block arguments (like foo {\n blah\n}) and infix operators where the operator follows a line break

I think one possible solution is to change the convention to not-indent multi-line infix operators

foo andThen
bar andThen 
baz

Granted, this is a different convention from what I see now (where people tend to indent the follow on lines of a multi-line infix chain) but if we make this the way you write multi-line infix operators, then we can unambiguously say that if the follow-on lines are indented, they are part of a {} application

davidholiday commented 7 years ago

Please don't do this. Please please please don't do this. One of the things I really dislike about Python is that white space is meaningful. Yes, there are plenty of reasonable arguments both in favor and against, but honestly I think for the most part this comes down to personal preference. My strong personal preference is not to make white space meaningful because it introduces a set of challenges (not the least of which is refactoring the current Scala code base and re-educating the community in the 'new' way of doing things) that I find unpleasant to deal with. Moreover, and specifically to the argument of 'it's cleaner' -- I'd remind proponents of this position that in languages like Python that have meaningful white space, there's often the need for a continuation symbol like a backslash for code that for whatever reason needs to be on multiple lines. If you're coding in a paradigm where that happens a lot (like creating a Spark application), you've replaced one set of 'ugly' symbols with another.

shawjef3 commented 7 years ago

Part of the reason many languages gain success is because there is one syntax. People argue about what it should be, but then a dictator or committee says 'this is how it is' and that's how it is. Allowing multiple syntaxes will cause confusion.

Scala is already criticized for being too flexible. This is going to make that criticism stronger with very little benefit.

shawjef3 commented 7 years ago

Or how about moving code from one place to another? Instead of braces telling your compiler and IDE the scope of what you pasted, you're going to have to manually fix it after pasting.

nafg commented 7 years ago

Seriously? What problem is this solving? I've seen people complain about lots of things in Scala. Curly braces aren't one of them (maybe except for once or twice a really long time ago). What number priority is this?

More to the point, what value does it add and what risk does it bring -- can you really put those on the two sides of a scale and say that it's worth doing?

People don't like flexible syntax. Where the flexibility is useful, they can be convinced. Otherwise, please don't add more than necessary. I thought that's why we were removing procedure syntax.

Also, as others have said, indentation-based syntax creates real problems for git diff, git merge, and copy-paste.

  1. If the nesting level changed, and also some formatting changes got introduced, forget about seeing "what really changed."
  2. If git auto-merges a block above some code, it can change the meaning of that code without it being noticed (rather than creating mismatched braces).
  3. If I copy-paste a snippet of code somewhere, I can't expect to reformat the code and fix the indentation, since the indentation is the primary source of meaning.
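A minimal sketch of points 2 and 3. It is written in the indentation-based syntax that Scala 3 ships today, which is an assumption here since it may differ in detail from the proposal under discussion. The function names are hypothetical. The two functions differ only in the indentation of one line, yet both compile and they behave differently:

```scala
import scala.collection.mutable.ListBuffer

// "done" is indented under the `if`, so it is only recorded when verbose is true.
def withDoneInside(verbose: Boolean): List[String] =
  val out = ListBuffer.empty[String]
  if verbose then
    out += "details"
    out += "done"
  out.toList

// Same tokens, one line shifted left by two spaces (for example after a paste
// or a merge): "done" is now recorded unconditionally.
def withDoneOutside(verbose: Boolean): List[String] =
  val out = ListBuffer.empty[String]
  if verbose then
    out += "details"
  out += "done"
  out.toList
```

With braces, the equivalent accident would change the position of a `}` or leave the braces unbalanced, which is visible in a diff or rejected by the compiler. Here the only trace is leading whitespace.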

If you think it's a good idea, give some support to Scalite. See how many people start using it. Don't risk fracturing the primary scala community and alienating people for zero gain.

Or better yet, spend the time creating good documentation. The lack of good documentation is the main reason I can't get anyone I know to look seriously at Scala.

nafg commented 7 years ago

Another point: if most people were coming to Scala from Python or F#, perhaps it would help adoption. But most people that I know get their exposure to programming in languages like JavaScript, Java, and C#. This will hurt adoption from those camps. Already people from those camps find Scala's syntax too foreign.

soronpo commented 7 years ago

I think this is an IDE feature and not a language feature. IDEs can support hiding the braces from view and working on user indentation to implement braces in the source code. But this must be just a viewing and user-interaction feature, and not change the source code.

pkolaczk commented 7 years ago

People don't like flexible syntax.

I agree, and this is why I really like this proposal. This proposal makes the syntax less flexible. Everybody agrees that indentation of code is needed, so currently code is indented and has braces. There are many ways you can place braces as well as many ways you can indent code. The number of combinations is huge. You can also make both contradictory (and braces win then).

Having whitespace as the main (only?) way of structuring code removes one degree of freedom.

Currently a simple if can be written in at least the following ways:

if (x) a else b

if (x) a
else b

if (x) {
  a
} else {
  b
}

if (x) {
  a
}
else {
  b
}

if (x)
  a
else
  b

if (x)
{
  a
}
else
{
  b
}

After the proposal it reduces to maybe just 3 simple and clean possibilities:

if x then a else b

if x then a
else b

if x then
  a
else
  b

If the nesting level changed, and also some formatting changes got introduced, forget about seeing "what really changed."

The same problem happens with braces. If you nest a block of code, you typically modify indentation as well, and this makes simple diff useless. However IDEs are already good at showing what changed, despite indentation changes.

If git auto-merges a block above some code, it can change the meaning of that code without it being noticed (rather than creating mismatched braces).

Git matches at least one line before and after a patched block. Such a thing cannot happen automatically.

megri commented 7 years ago

I'm not really in a position to say anything, but I'm afraid suggestions like this can divert effort from more pressing matters. Part of me likes the idea, but it feels like there are so many other things that have to be gotten right: things like compilation speed, improved tooling, uniform equality, type system changes in Dotty and elimination of Scala puzzlers.

Perhaps a change like the one suggested here isn't a big investment in time-to-market but I fear that without proper management it at best adds another way to do things which in my humble opinion, at this time, isn't what Scala/the future needs. At worst, it can split the user base and make establishing code conventions a living nightmare, even with automatic rewrite tools.

nafg commented 7 years ago

@pkolaczk I don't think anyone is considering outlawing braces, just making them optional. So that would make a total of 9 ways.