qwertie / ecsharp

Home of LoycCore, the LES language of Loyc trees, the Enhanced C# parser, the LeMP macro preprocessor, and the LLLPG parser generator.
http://ecsharp.net

LES3: Compact lists and attributes #105

Closed qwertie closed 4 years ago

qwertie commented 4 years ago

I'm adding support in the parser for "compact" expressions, i.e. expressions that do not contain spaces at the top level. This can be used for two things:

  1. Attributes. Previously, if you wrote @foo(x) statement(), the attribute foo(x) was parsed by a special rule which could handle parentheses but little else (e.g. @foo.x could not be parsed). The new code can handle arbitrary expressions that do not contain spaces, so @foo.x statement() is valid, as are @x=y statement() and @" Hello, "~" world" statement(). In the last example, notice that spaces in strings are not a problem.
  2. Argument lists, including tuples and JSON-like arrays. To use compact mode for an argument list, begin the list with . (dot-space). For example, [. a b c d] means [a, b, c, d] and Foo(. x y 2+1i a>0&&b>0) means Foo(x, y, 2 + 1i, a > 0 && b > 0). In argument lists, commas are allowed ([. a b, c d] still means [a, b, c, d]).
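
To illustrate what "no spaces at the top level" means, here is a rough Python sketch. It is not the actual LES3 parser. The function name `split_compact` is hypothetical, and it assumes that only double-quoted strings and (), [], {} nesting need to be tracked. It takes the text that follows the dot-space marker and splits it at top-level whitespace or commas, ignoring semicolons, comments and other literal kinds.

```python
def split_compact(body: str) -> list:
    """Split the body of a compact list into top-level items.

    An item ends at whitespace or a comma, unless that character sits
    inside a double-quoted string or inside nested (), [] or {}.
    """
    closers = {"(": ")", "[": "]", "{": "}"}
    items, current, stack = [], [], []
    in_string = False
    i = 0
    while i < len(body):
        ch = body[i]
        if in_string:
            current.append(ch)
            if ch == "\\" and i + 1 < len(body):
                # Keep an escaped character together with its backslash.
                i += 1
                current.append(body[i])
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            current.append(ch)
        elif ch in closers:
            # Remember which closing bracket we are waiting for.
            stack.append(closers[ch])
            current.append(ch)
        elif stack and ch == stack[-1]:
            stack.pop()
            current.append(ch)
        elif not stack and (ch.isspace() or ch == ","):
            # Top-level separator: finish the current item, if any.
            if current:
                items.append("".join(current))
                current = []
        else:
            current.append(ch)
        i += 1
    if current:
        items.append("".join(current))
    return items


print(split_compact("x y 2+1i a>0&&b>0"))
print(split_compact('" Hello, "~" world" statement()'))
```

Under these assumptions the first call yields the four arguments of the Foo(...) example, and the second keeps the string concatenation together as one item because its spaces are inside string literals.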

In compact lists, semicolons are stored as special identifiers: [. a b; c d] means [a, b, `';`, c, d]. I was interested in supporting the Julia-style syntax

[a b c
 d e f]

where a newline is equivalent to a semicolon, but I decided it was too complex. Essentially, the lexer filters out newlines inside parentheses and square brackets; if it did not, this would make otherwise-normal expressions that span several lines, such as

foo(x
    , y
    ++
    )

difficult to parse. Therefore, semicolons are required to separate rows in compact mode.
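
As a sketch of how code that consumes such a list might recover the rows, the following Python groups a flat argument list at each semicolon identifier. It assumes the arguments are available as plain strings and that the identifier's name is the text `';`. The helper name `split_rows` is hypothetical and is not part of Loyc.

```python
SEMICOLON = "';"  # assumed name of the special identifier written `';` in LES3


def split_rows(items):
    """Group a flat argument list into rows, using the `';` marker as the row separator."""
    rows, current = [], []
    for item in items:
        if item == SEMICOLON:
            # The marker closes the current row; it is not kept as data.
            rows.append(current)
            current = []
        else:
            current.append(item)
    if current:
        # Keep the final row when the list does not end with a semicolon.
        rows.append(current)
    return rows


print(split_rows(["a", "b", SEMICOLON, "c", "d"]))
```

With the five arguments from the semicolon example above, this yields two rows, a b and c d.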

Note: Compact mode is not allowed in braced blocks (e.g. {. 1 2 3} is not legal). Partly this is for the sake of the printer: the %newline trivia would be hard to deal with if a newline also had a semantic meaning like "semicolon", which it probably should have if compact mode were allowed in this context:

{.
   a b c
   ; d e f;
   // Er... what should the grammar be and how should the newlines affect the Loyc tree?
   g h i // and how will we print it back out?
}

Another reason to be cautious is that dot-indents are still supported and I'd like to decrease the chance that a dot-indent could be confused with a compact-mode marker:

{
.  // This is a dot-indent. At the beginning of a line, it is treated the same as whitespace.
.  // This is useful for pasting code on forums that don't preserve runs of spaces.
.  .if x < 0 {
.  .  .throw Error
.  }
}
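
The dot-indent rule can be approximated with a short Python sketch. This is an illustration rather than the real lexer. It assumes a dot-indent is one or more groups of "a dot followed by at least one space or tab" at the very start of a line, which is inferred from the example above. The function name `strip_dot_indents` is hypothetical.

```python
import re

# One or more "dot followed by spaces/tabs" groups at the start of a line.
# A dot followed directly by a letter (as in .if or .throw) is left alone.
_DOT_INDENT = re.compile(r"^(?:\.[ \t]+)+", re.MULTILINE)


def strip_dot_indents(source: str) -> str:
    """Replace each leading dot-indent with the same number of characters of plain spaces."""
    return _DOT_INDENT.sub(lambda m: " " * len(m.group(0)), source)


example = "{\n.  .if x < 0 {\n.  .  .throw Error\n.  }\n}"
print(strip_dot_indents(example))
```

Under this rule a list opener such as [. a b] is untouched because the dot is not at the start of a line, whereas a line that begins with ". a b" would be read as a dot-indent. That overlap is the kind of confusion the restriction on braced blocks is meant to make less likely.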

qwertie commented 4 years ago

Change merged today for v28.0. Compact expressions are not supported in the printer.