Open kaby76 opened 5 months ago
In addition, NL and semi are cavalierly sprinkled throughout the grammar, which causes ambiguity and quite poor performance. There is no consistent, thought-out principle for how and where they should be used. For example, consider how propertyDeclaration is parsed.
Input:
var a = 1
var a = 2
var a = 3
var a = 4
var a = 5
var a = 6
var a = 7
This input forces a large lookahead (k) because the parser needs the full context to decide which rule should consume each NL: propertyDeclaration or topLevelObject. The "spec" grammar implementation gets this wrong as well.
There is even a faux pas right after the wrong "NL" use in the production. ["(getter? (NL semi? setter)? | setter? (NL* semi? getter)?)"](https://github.com/Kotlin/kotlin-spec/blob/4b29a8b42e08237f45c0c3c185eaae4bba3751f6/grammar/src/main/antlr/KotlinParser.g4#L186C28-L186C87) is an alternation in which both alternatives can derive empty. A grammar should never offer a choice between empty and empty!
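To make the "empty vs empty" point concrete, here is a minimal Python sketch that models the quoted sub-expression and checks which alternatives can match nothing. The tuple-based expression model and the helper names (`tok`, `ref`, `seq`, `alt`, `opt`, `star`, `nullable`) are hypothetical and only for illustration. It also assumes that getter, setter, and semi each consume at least one token.

```python
# Tiny grammar-expression model: each node is a tuple whose first item is its kind.
def tok(name): return ("tok", name)
def ref(name): return ("ref", name)
def seq(*items): return ("seq", items)
def alt(*items): return ("alt", items)
def opt(item): return ("opt", item)
def star(item): return ("star", item)

def nullable(node):
    """Return True if the expression can match the empty token sequence."""
    kind = node[0]
    if kind in ("opt", "star"):
        return True
    if kind in ("tok", "ref"):
        # Assumption: getter, setter and semi each consume at least one token.
        return False
    if kind == "seq":
        return all(nullable(child) for child in node[1])
    if kind == "alt":
        return any(nullable(child) for child in node[1])
    raise ValueError(f"unknown node kind: {kind}")

def nullable_alternatives(node):
    """List the indices of the alternatives of an 'alt' node that can match empty."""
    return [i for i, child in enumerate(node[1]) if nullable(child)]

# (getter? (NL semi? setter)? | setter? (NL* semi? getter)?) as quoted in the issue
accessors = alt(
    seq(opt(ref("getter")), opt(seq(tok("NL"), opt(ref("semi")), ref("setter")))),
    seq(opt(ref("setter")), opt(seq(star(tok("NL")), opt(ref("semi")), ref("getter")))),
)

empty_alts = nullable_alternatives(accessors)
print(empty_alts)  # [0, 1]: both alternatives can derive empty
```

When no accessor follows the property, either alternative matches, so the parser has two ways to derive nothing at that decision.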
$ (trperf y > out; cat out | head -1 > out2; cat out | tail -n +2 | sort -k6 -n -r | head > out3; cat out2 out3 | column -t)
Time to parse: 00:00:00.1550288
Decision  Rule                    Invocations  Time      Total-k  Max-k  Fallback  Ambiguities  Errors  Transitions
157       propertyDeclaration     14           0.331425  203      50     7         7            0       25
305       postfixUnaryExpression  7            0.064142  21       3      0         0            0       4
146       propertyDeclaration     7            0.029213  21       3      0         0            0       2
300       asExpression            7            0.056031  14       2      0         0            0       3
289       elvisExpression         7            0.055394  14       2      0         0            0       3
278       conjunction             7            0.058866  14       2      0         0            0       3
275       disjunction             7            0.057148  14       2      0         0            0       3
156       propertyDeclaration     7            0.071258  14       2      0         0            0       3
350       primaryExpression       7            0.000744  7        1      0         0            0       1
301       prefixUnaryExpression   7            0.002013  7        1      0         0            0       1
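For readers without the shell tools, the ranking step of the pipeline above (keep the header, sort the remaining rows by Max-k in descending order, take the top rows) can be sketched in Python. This is only an illustration: the function name `top_by_max_k` is hypothetical, and it assumes the report is whitespace-separated with the header as its first line, as in the output shown above.

```python
def top_by_max_k(report_lines, limit=10):
    """Keep the header row and the `limit` rows with the largest Max-k."""
    header, *rows = [line.split() for line in report_lines if line.strip()]
    max_k = header.index("Max-k")  # sixth column in the output shown above
    # sorted() is stable, so ties keep their input order; GNU sort may order ties differently.
    ranked = sorted(rows, key=lambda row: int(row[max_k]), reverse=True)
    return [header] + ranked[:limit]

# Three rows copied from the report above, deliberately out of order.
sample = [
    "Decision Rule Invocations Time Total-k Max-k Fallback Ambiguities Errors Transitions",
    "146 propertyDeclaration 7 0.029213 21 3 0 0 0 2",
    "157 propertyDeclaration 14 0.331425 203 50 7 7 0 25",
    "350 primaryExpression 7 0.000744 7 1 0 0 0 1",
]

result = top_by_max_k(sample, limit=2)
for row in result:
    print(" ".join(row))
```

Looking the column up by its header name rather than by position keeps the sketch working if the column order of the report changes.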
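If you correct the NLs in propertyDeclaration and getter/setter, the lookahead problem is largely resolved: the largest Max-k drops from 50 to 3, although one propertyDeclaration decision still reports 7 fallbacks and 7 ambiguities.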
$ diff KotlinParser.g4 ..
178,179c178
< ) (NL* typeConstraints)? (NL* ('=' NL* expression | propertyDelegate))?
< (
---
> ) (NL* typeConstraints)? (NL* ('=' NL* expression | propertyDelegate))? (NL+ ';')? NL* (
203,204c202,203
< : NL? modifiers? 'get'
< | NL? modifiers? 'get' NL* '(' NL* ')' (NL* ':' NL* type_)? NL* functionBody
---
> : modifiers? 'get'
> | modifiers? 'get' NL* '(' NL* ')' (NL* ':' NL* type_)? NL* functionBody
208,209c207,208
< : NL? modifiers? 'set'
< | NL? modifiers? 'set' NL* '(' (annotation | parameterModifier)* setterParameter ')' (
---
> : modifiers? 'set'
> | modifiers? 'set' NL* '(' (annotation | parameterModifier)* setterParameter ')' (
02/10-07:58:19 ~/issues/g4-3959/kotlin/kotlin-formal/Generated-CSharp
$ (trperf y > out; cat out | head -1 > out2; cat out | tail -n +2 | sort -k6 -n -r | head > out3; cat out2 out3 | column -t)
Time to parse: 00:00:00.1211182
Decision  Rule                    Invocations  Time      Total-k  Max-k  Fallback  Ambiguities  Errors  Transitions
306       postfixUnaryExpression  7            0.051186  21       3      0         0            0       4
146       propertyDeclaration     7            0.030375  21       3      0         0            0       2
301       asExpression            7            0.044945  14       2      0         0            0       3
290       elvisExpression         7            0.041122  14       2      0         0            0       3
279       conjunction             7            0.048455  14       2      0         0            0       3
276       disjunction             7            0.038296  14       2      0         0            0       3
163       propertyDeclaration     7            0.191566  21       2      7         7            0       15
158       propertyDeclaration     7            0.056232  14       2      0         0            0       3
155       propertyDeclaration     7            0.039291  14       2      0         0            0       3
502       semis                   7            0.004448  7        1      0         0            0       2
02/10-07:59:24 ~/issues/g4-3959/kotlin/kotlin-formal/Generated-CSharp
I am trying to investigate https://github.com/antlr/antlr4-lab/issues/83.
There are two Kotlin grammars in this repo: kotlin and kotlin-formal. In addition, one can find the Kotlin grammar in the JetBrains repo (https://github.com/Kotlin/kotlin-spec/tree/4b29a8b42e08237f45c0c3c185eaae4bba3751f6/grammar/src/main/antlr).
The readmes of the two kotlin grammars in this repo don't explain why there are two, what the differences are, or which one to choose.
Which version of Kotlin does each of these grammars intend to support? Neither one states which release it targets.
The link https://github.com/antlr/grammars-v4/blob/1bfcc5a6b954008e23bc5a982864364a069c8756/kotlin/kotlin-formal/README.md?plain=1#L7 in the kotlin-formal readme is dead.
kotlin-formal is tested. https://github.com/antlr/grammars-v4/blob/1bfcc5a6b954008e23bc5a982864364a069c8756/kotlin/kotlin-formal/desc.xml#L3
kotlin is not tested. https://github.com/antlr/grammars-v4/blob/1bfcc5a6b954008e23bc5a982864364a069c8756/kotlin/kotlin/desc.xml#L3