Closed steve-hickman-epistimis closed 1 year ago
Chevrotain has a fixed lookahead that defaults to three tokens, which is not sufficient in this example (you need four tokens to disambiguate). But we switched to a dynamic lookahead in Langium. @msujew the new lookahead should be able to handle the situation, right?
Which lookahead is chosen depends on the parser settings in langium-config.json
@steve-hickman-epistimis I'm not sure what version of Langium you're using, but version 1.0 should come with unrestricted lookahead by default. It should be able to handle any ANTLR3 grammar without any issues (and more). You might need to remove the chevrotainParserConfig#maxLookahead property from your langium config.
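For readers unsure what removing that property looks like in practice, here is a minimal sketch. It assumes the config is plain JSON with a top-level chevrotainParserConfig object, as the names in this thread suggest. The helper name dropMaxLookahead and the sample values for projectName and recoveryEnabled are illustrative only; they are not taken from the reporter's project.

```typescript
// Sketch only: strips `maxLookahead` from a langium-config.json style document.
// `dropMaxLookahead` is a hypothetical helper, not part of Langium or Chevrotain.
type JsonObject = { [key: string]: unknown };

function dropMaxLookahead(configText: string): string {
    const config = JSON.parse(configText) as JsonObject;
    const parserConfig = config["chevrotainParserConfig"];
    // Only touch the section if it exists and is an object.
    if (typeof parserConfig === "object" && parserConfig !== null) {
        delete (parserConfig as JsonObject)["maxLookahead"];
    }
    return JSON.stringify(config, null, 4);
}

// Illustrative input; the real file will contain other entries as well.
const before = `{
    "projectName": "Example",
    "chevrotainParserConfig": { "maxLookahead": 3, "recoveryEnabled": true }
}`;

const after = dropMaxLookahead(before);
console.log(after);
```

In most cases it is simpler to open langium-config.json in the project root and delete the maxLookahead entry by hand; the snippet only makes explicit which key is meant.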
As far as I know, I'm using the most current version (sorry for the delay in responding - I had Covid)
@steve-hickman-epistimis So you are using v1.2.0? Did you try what @msujew suggested?
I just updated the langium generator to the current version (1.2). I reverted the change listed above and that problem does not appear.
On Chevrotain config: Honestly, I'm so rusty on npm that I don't know where the langium config is - so I don't know how to change the lookahead.
One other note: While I do not get the error above now, I do get this:
src/language-server/universal-data-definition-language.langium:511:36 - Keywords cannot only consist of whitespace characters.
The complaint is about:
hidden terminal WS returns string:(' ' | '\t' | '\r' | '\n' )+;
I would ignore this except I also get:
Langium generator failed.
Seems like this should be a configuration issue as well?
@steve-hickman-epistimis I think it would be best to ask further Langium-related questions in the Langium repository so that others can find your issue and its solution :)
I'm inclined to close this issue as not related to xtext2langium.
It appears that this is caused by xtext2langium reading the Terminals.xtext file and incorrectly converting the content. It generated
hidden terminal ML_COMMENT returns string:'/*' -> '*/' ;
hidden terminal SL_COMMENT returns string:'//' !('\n' | '\r' )('\r'? '\n' )? ;
hidden terminal WS returns string:(' ' | '\t' | '\r' | '\n' )+;
The last line being the problem.
I think the simple solution is for xtext2langium to generate the following for WS instead:
hidden terminal WS: /\s+/;
Given that some devs might change the definition of their WS token, just generating \s+ could be a destructive operation. We should probably just allow empty keywords in terminals.
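To make the concern concrete, here is a small sketch comparing the character set from the generated WS rule with \s+. It assumes Langium's regex terminals follow JavaScript regular expression semantics. The names generatedWs, regexWs and compareWs are made up for this illustration.

```typescript
// Character set taken from the generated rule: (' ' | '\t' | '\r' | '\n')+
const generatedWs = /^[ \t\r\n]+$/;
// The proposed replacement: \s+
const regexWs = /^\s+$/;

// Reports whether each definition accepts the whole input as whitespace.
function compareWs(input: string): { generated: boolean; regex: boolean } {
    return { generated: generatedWs.test(input), regex: regexWs.test(input) };
}

const samples: Array<[string, string]> = [
    ["space + tab", " \t"],
    ["CRLF", "\r\n"],
    ["form feed", "\f"],
    ["no-break space", "\u00a0"],
];

for (const [label, text] of samples) {
    const result = compareWs(text);
    console.log(`${label}: generated=${result.generated}, \\s+=${result.regex}`);
}
```

The two definitions agree on spaces, tabs and line breaks, but \s+ also accepts characters such as form feed and no-break space, so the rewrite would quietly widen the token even for an unmodified WS rule. A converter that wanted a regex could presumably emit /[ \t\r\n]+/ to stay closer to the original, though that is a suggestion rather than something xtext2langium is known to do.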
@steve-hickman-epistimis Regarding the maxLookahead config: it can be changed in the langium-config.json file, inside the chevrotainParserConfig section. See this example.
Whitespace-only keywords in terminals, such as those in
hidden terminal WS returns string:(' ' | '\t' | '\r' | '\n' )+;
are now allowed in Langium. So I'm closing the issue.
Thanks for reporting!
The following set of Xtext grammar rules will be processed correctly by ANTLR3 without ambiguity:
From these rules, xtext2langium generates the following:
Running npm run langium:generate on the generated Langium grammar results in the following errors:
I wasn't sure whether this is a problem with xtext2langium (not doing a proper conversion) or a problem with Chevrotain. The generated .langium seems syntactically correct, so xtext2langium should be OK.
Is this a weakness in Chevrotain? Or something else?
I was able to fix this by changing:
to
(Replacing the outer containing parentheses with curly braces).