Open yososs opened 2 years ago
Then probably this:
unconditionalAssignableSelector
: '[' expression ']'
| '.' identifier
;
should become:
unconditionalAssignableSelector
: '[' expression ']'
| '.' identifier
| '.' 'operator'
;
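To illustrate why the extra alternative matters, here is a minimal hand-written sketch, not the generated ANTLR parser. It assumes the lexer emits `operator` as its own keyword token rather than as an identifier; the class and method names are hypothetical, and the `'[' expression ']'` alternative is left out for brevity.

```java
import java.util.List;

// Hypothetical mini-recognizer for the '.' alternatives of the rule above.
final class SelectorSketch {
    // Assumption: 'operator' is lexed as a keyword token, so it is not an identifier.
    static boolean isIdentifier(String token) {
        return token.matches("[A-Za-z_$][A-Za-z0-9_$]*") && !token.equals("operator");
    }

    // Old rule: '.' identifier
    static boolean matchesOldRule(List<String> tokens) {
        return tokens.size() == 2
                && tokens.get(0).equals(".")
                && isIdentifier(tokens.get(1));
    }

    // Proposed rule: '.' identifier | '.' 'operator'
    static boolean matchesProposedRule(List<String> tokens) {
        return matchesOldRule(tokens)
                || (tokens.size() == 2
                        && tokens.get(0).equals(".")
                        && tokens.get(1).equals("operator"));
    }
}
```

Under that assumption, the token pair `.` `operator` is rejected by the old rule and accepted by the proposed one, while `.` `length` is accepted by both.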
This grammar is old. The newest grammar, maintained by Erik Ernst, is here, and appears to have fixed this issue. I recently ported the grammar here to "target-agnostic format" in response to an antlr-discussions question. I will update the grammar today.
Thanks for sharing. I will check the operation tomorrow.
I ran the following unit test code. It works well, but I found that there are still a few problems.
import org.antlr.v4.runtime.BaseErrorListener;
import org.antlr.v4.runtime.CharStreams;
import org.antlr.v4.runtime.CodePointCharStream;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.RecognitionException;
import org.antlr.v4.runtime.Recognizer;
import org.junit.Assert;
import org.junit.Test;

// DartLexer and DartParser are the classes ANTLR generates from the grammar.
public class DartParserTest {
    // see: https://dart.dev/guides/language/language-tour#keywords

    // Parses a small class that uses keyword k as a member name and reports
    // whether the parser raised a syntax error. When async is true, the
    // method body is marked async.
    private static boolean hasSyntaxError(String k, boolean async) {
        String content = "class A{\n"
                + " bool isPlusOrMinus(Expression expression) " + (async ? "async " : "") + "{\n"
                + " if (expression." + k + " == '+') return true;\n"
                + " if (expression." + k + " == '-') return true;\n"
                + " return false;\n"
                + " }\n"
                + "}\n";
        final CodePointCharStream cstream = CharStreams.fromString(content);
        final DartLexer lexer = new DartLexer(cstream);
        final CommonTokenStream stream = new CommonTokenStream(lexer);
        stream.fill();
        DartParser parser = new DartParser(stream);
        // One-element array so the anonymous listener can record the error.
        boolean[] syntaxErr = new boolean[1];
        parser.addErrorListener(new BaseErrorListener() {
            @Override
            public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol, int line,
                    int charPositionInLine, String msg, RecognitionException e) {
                syntaxErr[0] = true;
            }
        });
        parser.libraryDefinition();
        return syntaxErr[0];
    }

    // Reserved words: using one as a member name must be a syntax error.
    @Test
    public void testKeywords0() {
        String[] keywords0 = { "assert", "break", "case", "catch", "class", "const", "continue", "default", "do", "else",
                "enum", "extends", "false", "final", "finally", "for", "if", "in", "is",
                "new",
                "null", "rethrow",
                "return", "super", "switch", "this", "throw", "true", "try", "var", "void", "while", "with" };
        for (String k : keywords0) {
            Assert.assertTrue("error in " + k, hasSyntaxError(k, false));
        }
    }

    // Contextual keywords: must be accepted as member names.
    @Test
    public void testKeywords1() {
        String[] keywords1 = { "show", "async", "sync", "on", "hide" };
        for (String k : keywords1) {
            Assert.assertFalse("error in " + k, hasSyntaxError(k, false));
        }
    }

    // Built-in identifiers: must be accepted as member names.
    @Test
    public void testKeywords2() {
        String[] keywords2 = { "abstract", "as", "covariant", "deferred", "dynamic", "export", "extension", "external",
                "factory", "Function", "get", "implements", "import", "interface", "late", "library", "mixin",
                "operator", "part", "required", "set", "static", "typedef" };
        for (String k : keywords2) {
            Assert.assertFalse("error in " + k, hasSyntaxError(k, false));
        }
    }

    // await and yield inside an async body: expected to be a syntax error.
    @Test
    public void testKeywords3_async() {
        String[] keywords3 = { "await", "yield" };
        for (String k : keywords3) {
            Assert.assertTrue("error in " + k, hasSyntaxError(k, true));
        }
    }

    // await and yield outside an async body: must be accepted as member names.
    @Test
    public void testKeywords3() {
        String[] keywords3 = { "await", "yield" };
        for (String k : keywords3) {
            Assert.assertFalse("error in " + k, hasSyntaxError(k, false));
        }
    }
}
A little bit of status...
I've been updating the scraper and have a grammar that works only "so so"--but better than the other available grammars. See https://github.com/kaby76/ScrapeDartSpec/blob/master/scraped.g4
I've written a small thread describing how this compares with the current "dart2/" grammar and the "reference grammar" that was written by the Dart Language Team. https://twitter.com/KenDomino/status/1533053623554428929
There is still a lot of work to do.
Do you have a comparison to the antlr4 grammar found in the dart sdk?
Yes. I ran sdk sources through the Dart grammar written by the Dart Language Team. The results are here. It didn't do as well as the scraped grammar.
Comparison results are good. I will actually use it too.
I ran the same test using scraped.g4.
The test for testKeywords0 now passes, but the test for testKeywords3 fails.
The Dart language seems to have a complicated syntax due to the special specification of keywords.
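The keyword rules that the unit test above checks can be summarized in a small lookup. This is only a sketch of the test's expectations, not of the Dart specification itself; the class and method names are hypothetical, and the reserved-word list is copied from `testKeywords0`.

```java
import java.util.Set;

// Hypothetical summary of which words the unit test expects to be usable as member names.
final class DartKeywordKinds {
    // Reserved words (testKeywords0): never valid after '.'.
    static final Set<String> RESERVED = Set.of(
            "assert", "break", "case", "catch", "class", "const", "continue", "default", "do", "else",
            "enum", "extends", "false", "final", "finally", "for", "if", "in", "is", "new",
            "null", "rethrow", "return", "super", "switch", "this", "throw", "true", "try", "var",
            "void", "while", "with");

    // Words the test rejects only inside an async body (testKeywords3_async).
    static final Set<String> ASYNC_ONLY = Set.of("await", "yield");

    static boolean allowedAsMemberName(String word, boolean inAsyncBody) {
        if (RESERVED.contains(word)) {
            return false;
        }
        if (ASYNC_ONLY.contains(word)) {
            return !inAsyncBody;
        }
        // Built-in identifiers, contextual keywords, and ordinary names.
        return true;
    }
}
```

For example, `allowedAsMemberName("operator", false)` is true, while `allowedAsMemberName("await", true)` is false. A grammar has to encode the same distinctions, which is why a single `identifier` rule is not enough.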
The Spec does not define rules for dynamic types. https://github.com/dart-lang/language/issues/2276. After adding in 'dynamic' as a type, the grammar accepts 78% of the Dart sdk. Much much better.
https://github.com/dart-lang/language/issues/2279
Now 94% of the sdk passing.
Another problem with the Spec, https://github.com/dart-lang/language/issues/2282, occurs with "abstract" modifiers on fields. I have a workaround, but it's a terrible hack (the old rule was this; it is now this). The grammar in the Spec doesn't even correspond directly to the hand-written parser in the Dart compiler.
Now 95% of the sdk passing.
Status: I have a new grammar that passes 369 out of 372 Dart source files in the sdk. I think I'll stop here. I plan on using this as a bootstrap grammar to parse the Dart compiler and scrape the grammar directly from the sources. Although the quality of the grammar that the Dart team provides is very good, the fact that it's two years behind the source code means that it will always be out of date. It's a similar situation for other languages. Scraping the source of the compiler is the only real solution.
Status
The good news: I have a new Dart2 grammar that parses 100% of the Dart2 SDK.
The grammar requires two semantic predicates in the lexer. Since I want this to work across targets, I've been working to write the grammar in "target agnostic format".
However, the split parser for C# is not working. I have done many dozens of these conversions to "target agnostic format", for all but one of the targets, so I am confident that I am doing it correctly. While the lexer tokens are the same, the parser behaves differently for the split and combined grammars.
Therefore, it is likely that I've stumbled on a bug in the parser runtime for C#. I am looking into the problem.
The error occurs in both C# and Java for a split grammar, but not for the combined grammar for either target. This is bad. It means there is a problem across targets for split grammars--unless the combined grammar code was supposed to produce a parse error.
The problem was with string literals. I defined rules that should not have been there. https://github.com/antlr/grammars-v4/pull/2654 fixes #2597.
Checking the Dart2 specification, 'operator' can be used as a field name. Probably the same problem will occur with other marked keywords.
Keyword Specifications for Dart2
Reproduced code