TypeFox / chevrotain-allstar

Plugin module for the ALL(*) lookahead algorithm in Chevrotain
MIT License

Parser is not taking outer context into account #1

Open daumantas-kavolis-sensmetry opened 1 year ago

daumantas-kavolis-sensmetry commented 1 year ago

Langium version: 1.1

Steps To Reproduce

  1. Place a token directly after a call to a rule that itself parses the same token with cardinality >1.
  2. The parser treats the token that follows the rule as part of the rule.

I encountered this while using :: as a scope separator for references: the delimited reference is parsed in parts, and an optional group starting with :: follows directly after the reference. The optional group is never parsed, and the CST text of the reference ends with :: instead of an ID. Curiously, combining the two tokens into one makes both the group and the rule parse correctly, although the option of having spaces in between is lost.
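To make the symptom concrete, here is a minimal hand-written sketch in TypeScript. It is not Chevrotain or chevrotain-allstar code; the token strings, the `*` wildcard token, and the names `parseGreedy` and `isId` are assumptions made up for illustration. The toy grammar is `ID ("::" ID)*` followed by an optional `"::" "*"` group, and the loop decides whether to iterate by looking at a single token only.

```typescript
// Toy tokens are plain strings: identifiers, "::", "*", ";" (hypothetical).
type Token = string;

const isId = (t: Token | undefined): boolean =>
  t !== undefined && /^[A-Za-z_]\w*$/.test(t);

interface Result {
  reference: Token[];   // tokens consumed by the reference rule
  groupParsed: boolean; // whether the optional ("::" "*")? group matched
}

// Reference: ID ("::" ID)*   followed by   ("::" "*")?
function parseGreedy(tokens: Token[]): Result {
  let pos = 0;
  const reference: Token[] = [];
  if (isId(tokens[pos])) {
    reference.push(tokens[pos++]);
  }

  // Loop decision based on ONE token: "::" alone means "iterate again".
  while (tokens[pos] === "::") {
    reference.push(tokens[pos++]);   // the separator is consumed...
    if (!isId(tokens[pos])) break;   // ...before noticing that no ID follows
    reference.push(tokens[pos++]);
  }

  // The optional group never sees its "::" because the loop already took it.
  let groupParsed = false;
  if (tokens[pos] === "::" && tokens[pos + 1] === "*") {
    pos += 2;
    groupParsed = true;
  }
  return { reference, groupParsed };
}

const greedy = parseGreedy(["a", "::", "b", "::", "*", ";"]);
console.log(greedy.reference.join(""), greedy.groupParsed); // a::b:: false
```

The reference text ends with `::` and the optional group is skipped, which matches the behavior described in this report.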

Link to code example:

https://langium.org/playground?grammar=OYJwhgthYgBAIge2gSwHYFlEBMCmAbAKENzQBcQBPBZMdCHAgLkNjdgAoDcJSyBnANQBeAIIAjfhTABjMgFF8PPgEoAVAG5iEqeDmLl5Fu1gAFWQGswwXPFwz8MMGRSI0sAD6wAKpQAOuFqE5jJWNnYOTi5uxuwA5H6W1rhxsGiQuMIAkvCwcQDecawm7FxKvORCYpLS%2BuV8xSWesCgQfoggAiJZbR1kjSXqA3EAvnFBPe2dsWxxrVNkqWQwNmTC8uQoZJQASrgAZpxxTEypIPYAriD8KABuuAD8wnFqanEqD3ka48S%2BATMIZxgP64ZobFzbILwIEggFxbBA7YBVLpXjZeBBcFbShwvjYlEZdFHXAADzIpGw-FS-AuARAIPWm22e32HzyhQGJg4%2B1wziuuCqADFeWR%2BUMTKMfoRhXzzgCONA0JQnnFFZR3p9UZkcnlTrAkZkQSygvtwMAKmRYCzcOc0DJcADEp0qgBtHS1BT1Iw5AC6RxOqSdXWEbpqek9hjITF96iCRoOAOttvtWixzITAyTpBTxAAFihsHh3OSQBB0GB8LAAOoAZSYsAA9AAdIQNrQlsvpSs5esNl0AfTAAFoAF6iIcALR9LqbAHd%2Bz61G28wWi-qbZ2K7AMAAZfsAYQA8hgMPIAHLeXtN5tqGf8Js1xcPJtqa-L-OF0jr0vlys1vdHie56Xo217Xi6AB6TZoE2ICLm2QA&content=A4Qwxg1iDmCmAEAFeBvAUPT8CWBbYA9gE4Au8AggFyUBC1AVPQNxoC%2BQA

The current behavior

This is a slightly modified domain model example, but it fails to parse the trailing ::**. With LL(*), a similar grammar shows no apparent parsing errors, yet the optional group is not parsed and the CST range/text is clearly wrong: it terminates early at the last separator :: instead of at ;.

The expected behavior

The trailing optional group is parsed and CST terminates at ;.

msujew commented 1 year ago

I've transferred this issue to the chevrotain-allstar repo since it's the underlying lookahead implementation that's at fault here. The current algorithm doesn't take outer context into account. It's not trivial to implement, so it might take a while until this is resolved.
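As a rough illustration of what "taking outer context into account" means for a grammar of this shape, the sketch below hardcodes the knowledge that a `::` followed by something other than an ID belongs to whatever comes after the rule. A real ALL(*) implementation would derive this from the grammar rather than hardcode it; the toy grammar `ID ("::" ID)*` followed by `("::" "*")?`, the `*` token, and the name `parseWithContext` are all assumptions for illustration, not chevrotain-allstar code.

```typescript
// Toy tokens are plain strings: identifiers, "::", "*", ";" (hypothetical).
type Token = string;

const isId = (t: Token | undefined): boolean =>
  t !== undefined && /^[A-Za-z_]\w*$/.test(t);

// Reference: ID ("::" ID)*   followed by   ("::" "*")?
function parseWithContext(tokens: Token[]): { reference: string; groupParsed: boolean } {
  let pos = 0;
  const consumed: Token[] = [];
  if (isId(tokens[pos])) {
    consumed.push(tokens[pos++]);
  }

  // Iterate only if "::" is followed by an ID. A "::" followed by
  // anything else is left for the code that called this rule.
  while (tokens[pos] === "::" && isId(tokens[pos + 1])) {
    consumed.push(tokens[pos++]); // "::"
    consumed.push(tokens[pos++]); // ID
  }

  // The optional group now gets a chance to match its own "::".
  let groupParsed = false;
  if (tokens[pos] === "::" && tokens[pos + 1] === "*") {
    pos += 2;
    groupParsed = true;
  }
  return { reference: consumed.join(""), groupParsed };
}

console.log(parseWithContext(["a", "::", "b", "::", "*", ";"]));
// { reference: 'a::b', groupParsed: true }
```

In this toy case one extra token of lookahead is enough. In general the number of tokens needed depends on what can follow the rule at each call site, which is why the fix is not trivial.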

daumantas-kavolis-sensmetry commented 1 year ago

The current behaviour is also very misleading: the parser just fails silently. It should either parse the full group, parse none of it if it cannot be parsed fully, or emit a parser error, none of which currently happens. Would the parser need to take into account the outer context if it could detect that it cannot parse an optional group and simply not try to parse it?
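One possible reading of this suggestion is speculative parsing: try a whole iteration and commit to it only if every part matches, otherwise leave the input position untouched. The sketch below is a hand-written toy for the rule `ID ("::" ID)*`, not a description of how Chevrotain or chevrotain-allstar works; `tryIteration` and `parseReferenceBacktracking` are hypothetical names.

```typescript
// Toy tokens are plain strings: identifiers, "::", "*", ";" (hypothetical).
type Token = string;

const isId = (t: Token | undefined): boolean =>
  t !== undefined && /^[A-Za-z_]\w*$/.test(t);

// Tries one ("::" ID) iteration starting at `pos`.
// Returns the position after it, or undefined if it does not fully match.
function tryIteration(tokens: Token[], pos: number): number | undefined {
  if (tokens[pos] !== "::") return undefined;
  if (!isId(tokens[pos + 1])) return undefined;
  return pos + 2;
}

// Reference: ID ("::" ID)*
// Returns the index of the first token NOT consumed by the reference.
function parseReferenceBacktracking(tokens: Token[]): number {
  let pos = 0;
  if (!isId(tokens[pos])) return pos;
  pos++;

  // Commit to an iteration only after it has matched completely.
  let next: number | undefined;
  while ((next = tryIteration(tokens, pos)) !== undefined) {
    pos = next;
  }
  return pos;
}

console.log(parseReferenceBacktracking(["a", "::", "b", "::", "*", ";"])); // 3
```

Here the reference stops before the second `::`, so a following optional group could still match it. The trade-off is that failed attempts re-read tokens, and for longer or nested iterations that repeated work can become expensive compared with a lookahead decision computed from the grammar.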