dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License
10.23k stars 1.57k forks source link

[grammar] comma-related Inconsistency between the spec and implementation. #54538

Open modulovalue opened 10 months ago

modulovalue commented 10 months ago

Consider:

void main() {
  for (;; print("foo"), ) {
    break;
  }
}

Notice the trailing comma, which, according to the spec, doesn't appear to be valid Dart syntax. However, that program is accepted as a valid dart program by the implementation.

The analyzer and parsers based on Dart.g produce the following parse trees:

 === Analyzer ===
Scan errors: 0
Parse errors: 0
<CompilationUnitImpl> [0-59]
┗━ <FunctionDeclarationImpl> [0-58]
  ┣━ <NamedTypeImpl> [0-4]
  ┃  ┗━ 'void' [0-4]
  ┣━ 'main' [5-9]
  ┗━ <FunctionExpressionImpl> [9-58]
    ┣━ <FormalParameterListImpl> [9-11]
    ┃  ┣━ '(' [9-10]
    ┃  ┗━ ')' [10-11]
    ┗━ <BlockFunctionBodyImpl> [12-58]
      ┗━ <BlockImpl> [12-58]
        ┣━ '{' [12-13]
        ┣━ <ForStatementImpl> [16-56]
        ┃  ┣━ 'for' [16-19]
        ┃  ┣━ '(' [20-21]
        ┃  ┣━ <ForPartsWithExpressionImpl> [21-36]
        ┃  ┃  ┣━ ';' [21-22]
        ┃  ┃  ┣━ ';' [22-23]
        ┃  ┃  ┗━ <MethodInvocationImpl> [24-36]
        ┃  ┃    ┣━ <SimpleIdentifierImpl> [24-29]
        ┃  ┃    ┃  ┗━ 'print' [24-29]
        ┃  ┃    ┗━ <ArgumentListImpl> [29-36]
        ┃  ┃      ┣━ '(' [29-30]
        ┃  ┃      ┣━ <SimpleStringLiteralImpl> [30-35]
        ┃  ┃      ┃  ┗━ '"foo"' [30-35]
        ┃  ┃      ┗━ ')' [35-36]
        ┃  ┣━ ')' [38-39]
        ┃  ┗━ <BlockImpl> [40-56]
        ┃    ┣━ '{' [40-41]
        ┃    ┣━ <BreakStatementImpl> [46-52]
        ┃    ┃  ┣━ 'break' [46-51]
        ┃    ┃  ┗━ ';' [51-52]
        ┃    ┗━ '}' [55-56]
        ┗━ '}' [57-58]
--------------------------------------------------------------------------------
 === ANTLR ===
line 2:24 extraneous input ')' expecting {'(', '[', '~', '<', '-', '!', '++', '--', '#', 'const', 'false', 'new', 'null', 'super', 'switch', 'this', 'throw', 'true', 'abstract', 'as', 'covariant', 'deferred', 'dynamic', 'export', 'extension', 'external', 'factory', 'Function', 'get', 'implements', 'import', 'interface', 'late', 'library', 'operator', 'mixin', 'part', 'required', 'set', 'static', 'typedef', 'await', 'yield', 'async', 'base', 'hide', 'of', 'on', 'sealed', 'show', 'sync', 'when', NUMBER, HEX_NUMBER, RAW_SINGLE_LINE_STRING, RAW_MULTI_LINE_STRING, SINGLE_LINE_STRING_SQ_BEGIN_END, SINGLE_LINE_STRING_SQ_BEGIN_MID, SINGLE_LINE_STRING_DQ_BEGIN_END, SINGLE_LINE_STRING_DQ_BEGIN_MID, MULTI_LINE_STRING_SQ_BEGIN_END, MULTI_LINE_STRING_SQ_BEGIN_MID, MULTI_LINE_STRING_DQ_BEGIN_END, MULTI_LINE_STRING_DQ_BEGIN_MID, '{', IDENTIFIER}
line 2:26 missing ')' at '{'
Errors of type 1: [[@16,46:50='break',<53>,3:4] Bad state: ]
Errors of type 2: [), <missing ')'>]
line 2:24 extraneous input ')' expecting {'(', '[', '~', '<', '-', '!', '++', '--', '#', 'const', 'false', 'new', 'null', 'super', 'switch', 'this', 'throw', 'true', 'abstract', 'as', 'covariant', 'deferred', 'dynamic', 'export', 'extension', 'external', 'factory', 'Function', 'get', 'implements', 'import', 'interface', 'late', 'library', 'operator', 'mixin', 'part', 'required', 'set', 'static', 'typedef', 'await', 'yield', 'async', 'base', 'hide', 'of', 'on', 'sealed', 'show', 'sync', 'when', NUMBER, HEX_NUMBER, RAW_SINGLE_LINE_STRING, RAW_MULTI_LINE_STRING, SINGLE_LINE_STRING_SQ_BEGIN_END, SINGLE_LINE_STRING_SQ_BEGIN_MID, SINGLE_LINE_STRING_DQ_BEGIN_END, SINGLE_LINE_STRING_DQ_BEGIN_MID, MULTI_LINE_STRING_SQ_BEGIN_END, MULTI_LINE_STRING_SQ_BEGIN_MID, MULTI_LINE_STRING_DQ_BEGIN_END, MULTI_LINE_STRING_DQ_BEGIN_MID, '{', IDENTIFIER}
line 2:26 missing ')' at '{'
<startSymbol>
┗━ <libraryDefinition>
  ┣━ <metadata>
  ┣━ <topLevelDefinition>
  ┃  ┣━ <functionSignature>
  ┃  ┃  ┣━ <type>
  ┃  ┃  ┃  ┗━ <typeNotFunction>
  ┃  ┃  ┃    ┗━ 'void'
  ┃  ┃  ┣━ <identifier>
  ┃  ┃  ┃  ┗━ 'main'
  ┃  ┃  ┗━ <formalParameterPart>
  ┃  ┃    ┗━ <formalParameterList>
  ┃  ┃      ┣━ '('
  ┃  ┃      ┗━ ')'
  ┃  ┗━ <functionBody>
  ┃    ┗━ <block>
  ┃      ┣━ '{'
  ┃      ┣━ <statements>
  ┃      ┃  ┗━ <statement>
  ┃      ┃    ┗━ <nonLabelledStatement>
  ┃      ┃      ┗━ <forStatement>
  ┃      ┃        ┣━ 'for'
  ┃      ┃        ┣━ '('
  ┃      ┃        ┣━ <forLoopParts>
  ┃      ┃        ┃  ┣━ <forInitializerStatement>
  ┃      ┃        ┃  ┃  ┗━ ';'
  ┃      ┃        ┃  ┣━ ';'
  ┃      ┃        ┃  ┗━ <expressionList>
  ┃      ┃        ┃    ┣━ <expression>
  ┃      ┃        ┃    ┃  ┗━ <conditionalExpression>
  ┃      ┃        ┃    ┃    ┗━ <ifNullExpression>
  ┃      ┃        ┃    ┃      ┗━ <logicalOrExpression>
  ┃      ┃        ┃    ┃        ┗━ <logicalAndExpression>
  ┃      ┃        ┃    ┃          ┗━ <equalityExpression>
  ┃      ┃        ┃    ┃            ┗━ <relationalExpression>
  ┃      ┃        ┃    ┃              ┗━ <bitwiseOrExpression>
  ┃      ┃        ┃    ┃                ┗━ <bitwiseXorExpression>
  ┃      ┃        ┃    ┃                  ┗━ <bitwiseAndExpression>
  ┃      ┃        ┃    ┃                    ┗━ <shiftExpression>
  ┃      ┃        ┃    ┃                      ┗━ <additiveExpression>
  ┃      ┃        ┃    ┃                        ┗━ <multiplicativeExpression>
  ┃      ┃        ┃    ┃                          ┗━ <unaryExpression>
  ┃      ┃        ┃    ┃                            ┗━ <postfixExpression>
  ┃      ┃        ┃    ┃                              ┣━ <primary>
  ┃      ┃        ┃    ┃                              ┃  ┗━ <identifier>
  ┃      ┃        ┃    ┃                              ┃    ┗━ 'print'
  ┃      ┃        ┃    ┃                              ┗━ <selector>
  ┃      ┃        ┃    ┃                                ┗━ <argumentPart>
  ┃      ┃        ┃    ┃                                  ┗━ <arguments>
  ┃      ┃        ┃    ┃                                    ┣━ '('
  ┃      ┃        ┃    ┃                                    ┣━ <argumentList>
  ┃      ┃        ┃    ┃                                    ┃  ┗━ <argument>
  ┃      ┃        ┃    ┃                                    ┃    ┗━ <expression>
  ┃      ┃        ┃    ┃                                    ┃      ┗━ <conditionalExpression>
  ┃      ┃        ┃    ┃                                    ┃        ┗━ <ifNullExpression>
  ┃      ┃        ┃    ┃                                    ┃          ┗━ <logicalOrExpression>
  ┃      ┃        ┃    ┃                                    ┃            ┗━ <logicalAndExpression>
  ┃      ┃        ┃    ┃                                    ┃              ┗━ <equalityExpression>
  ┃      ┃        ┃    ┃                                    ┃                ┗━ <relationalExpression>
  ┃      ┃        ┃    ┃                                    ┃                  ┗━ <bitwiseOrExpression>
  ┃      ┃        ┃    ┃                                    ┃                    ┗━ <bitwiseXorExpression>
  ┃      ┃        ┃    ┃                                    ┃                      ┗━ <bitwiseAndExpression>
  ┃      ┃        ┃    ┃                                    ┃                        ┗━ <shiftExpression>
  ┃      ┃        ┃    ┃                                    ┃                          ┗━ <additiveExpression>
  ┃      ┃        ┃    ┃                                    ┃                            ┗━ <multiplicativeExpression>
  ┃      ┃        ┃    ┃                                    ┃                              ┗━ <unaryExpression>
  ┃      ┃        ┃    ┃                                    ┃                                ┗━ <postfixExpression>
  ┃      ┃        ┃    ┃                                    ┃                                  ┗━ <primary>
  ┃      ┃        ┃    ┃                                    ┃                                    ┗━ <literal>
  ┃      ┃        ┃    ┃                                    ┃                                      ┗━ <stringLiteral>
  ┃      ┃        ┃    ┃                                    ┃                                        ┗━ <singleLineString>
  ┃      ┃        ┃    ┃                                    ┃                                          ┗━ '"foo"'
  ┃      ┃        ┃    ┃                                    ┗━ ')'
  ┃      ┃        ┃    ┣━ ','
  ┃      ┃        ┃    ┗━ <expression>
  ┃      ┃        ┃      ┗━ ')'
  ┃      ┃        ┣━ '<missing ')'>'
  ┃      ┃        ┗━ <statement>
  ┃      ┃          ┗━ <nonLabelledStatement>
  ┃      ┃            ┗━ <block>
  ┃      ┃              ┣━ '{'
  ┃      ┃              ┣━ <statements>
  ┃      ┃              ┃  ┗━ <statement>
  ┃      ┃              ┃    ┗━ <nonLabelledStatement>
  ┃      ┃              ┃      ┗━ <breakStatement>
  ┃      ┃              ┃        ┣━ 'break'
  ┃      ┃              ┃        ┗━ ';'
  ┃      ┃              ┗━ '}'
  ┃      ┗━ '}'
  ┗━ '<EOF>'

Notice how the ANTLR based parser rejects that program, but the analyzer does not.

lrhn commented 10 months ago

Seems like the easiest fix is to change the spec. We generally allow trailing commas on places like this.

eernstg commented 10 months ago

Right, I don't think it's reasonable to introduce a breaking change in order to remove support for trailing commas in in particular situation where it is actually already implemented.

One thing, though, does dart format know about these trailing commas? I would have expected the following to have more line breaks, but it is actually formatted as shown here:

void main() {
  for (int i; i < 10; print("foo"), ++i, print("bar"),) {
    break;
  }
}

@munificent, what do you think about trailing commas in <forLoopParts>?

munificent commented 10 months ago

Wow, I'm a little surprised the formatter doesn't end up inadvertently dropping the trailing comma on the floor.

Filed this to track how we want to handle it: https://github.com/dart-lang/dart_style/issues/1354

As far as the language is concerned, it's hard to say. The language doesn't seem to have any consistent principle for where trailing commas are allowed:

I guess for loop updaters are most similar to enum values where there is a closing delimiter, but the stuff inside the delimiter is more than just the comma-separated list of things? In an enum, there may be ; followed by members. In a for loop, there are the preceding for parts clauses.

lrhn commented 10 months ago

For enums, I'd say there is a trailing delimiter, which is either } or ;.

I agree that the leading and trailing delimiters are not necessarily matched, and for loop increments are like that too, started by ; and ended by ).

Inside that, for-loop increments are more like function arguments, they are comma separated completely general expressions.

Currently implements also has a recognizable end, terminated by either { or ; (mixin application classes). I'd avoid exploiting that, since it's really just an accident that nothing else can come after.

modulovalue commented 10 months ago

I hope this issue isn't viewed as some evidence that there's "people who depend on this feature". I've discovered this one by accident (it seems that @lrhn discovered it first: https://github.com/dart-lang/language/issues/2430#issuecomment-1225458685!) and I don't plan on ever using it. I wouldn't mind a breaking change that removed this feature to prevent any potential formatter issues.


@munificent wrote

As far as the language is concerned, it's hard to say. The language doesn't seem to have any consistent principle for where trailing commas are allowed:

  • Has closing delimiter and trailing comma is allowed: collection literal, argument list, parameter list, assert statement arguments.
  • Has closing delimiter but disallows trailing comma: type parameter list, type argument list.

https://github.com/dart-lang/language/issues/2430