eclipse-langium / langium

Next-gen language engineering / DSL framework
https://langium.org/
MIT License

Data type using one terminal behaves differently from that terminal #1358

Closed msujew closed 4 months ago

msujew commented 5 months ago

Discussed in https://github.com/eclipse-langium/langium/discussions/1357

Originally posted by **drhagen** January 25, 2024

If I have this grammar

```langium
grammar HelloWorld

entry Model:
    ((persons+=Person) EOL)*;

Person:
    'person' name=ID;

hidden terminal WS: /[ \t]+/;
terminal ID: /[_a-zA-Z][\w_]*/;
terminal NEWLINE: '\n';

EOL returns string:
    NEWLINE;
```

and try to parse this file

```
person John
person Jane
```

I get an error "Expecting end of file but found 'person'".

![image](https://github.com/eclipse-langium/langium/assets/2952478/2c78c6d1-7274-410e-814d-ee424d5888aa)

If I replace `EOL` with `NEWLINE` and delete `EOL` entirely

```langium
grammar HelloWorld

entry Model:
    ((persons+=Person) NEWLINE)*;

Person:
    'person' name=ID;

hidden terminal WS: /[ \t]+/;
terminal ID: /[_a-zA-Z][\w_]*/;
terminal NEWLINE: '\n';
```

the file parses successfully. Am I missing something about Langium? I would have expected those two grammars to be equivalent.

msujew commented 5 months ago

It seems we only ever assume that unassigned calls to data type rules come from other data type rules; see: https://github.com/eclipse-langium/langium/blob/0239db7b353bf06f5c0710e40bd264fe17fccb55/packages/langium/src/parser/langium-parser.ts#L236-L251

The assumption breaks as soon as we call the data type rule from a normal parser rule without an assignment. I assume the solution would be to just ignore the result and store the CST created in the current AST node.
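
To make the two cases concrete, here is a minimal toy sketch of the proposed behavior. It is not Langium's actual code: `Frame`, `CstLeaf`, `AstNode`, and `onUnassignedDataTypeCall` are hypothetical names invented for illustration, and concatenating strings in the data type case is an assumption about how a string-valued rule might combine results.

```typescript
// Toy model only: none of these names come from Langium's real API.
type CstLeaf = { text: string };

interface AstNode {
    $type: string;
    cst: CstLeaf[];
}

// The caller of a rule is either a data type rule (accumulating a string value)
// or a normal parser rule (building an AST node).
type Frame =
    | { kind: 'datatype'; value: string }
    | { kind: 'node'; node: AstNode };

/**
 * Hypothetical handler for an unassigned call to a data type rule.
 * `result` is the string the callee produced; `leaves` are the CST leaves it consumed.
 */
function onUnassignedDataTypeCall(caller: Frame, result: string, leaves: CstLeaf[]): void {
    if (caller.kind === 'datatype') {
        // Case that is already assumed: a data type rule calls another data type rule.
        // (Assumption: the results are combined by string concatenation.)
        caller.value += result;
    } else {
        // Case from this issue: a normal parser rule calls the data type rule
        // without an assignment. Ignore the string result, but keep the consumed
        // CST on the current AST node.
        caller.node.cst.push(...leaves);
    }
}
```

In the grammar from the discussion, `Model` calling `EOL` without an assignment would take the second branch: the newline is consumed and recorded in the CST of the `Model` node, and the string returned by `EOL` is dropped.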