Closed by delzac 5 years ago
The problem is that the letter i can be parsed as either LCASE_LETTER or ROMAN_NUMERALS. This means that the first i is actually parsed as a ROMAN_NUMERALS token, making it invalid to parse them the other way around. There are two ways to fix this:
? in the qns_alphabet
Make LCASE_LETTER have a higher priority than ROMAN_NUMERALS:
LCASE_LETTER.2: "a".."z"
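To make the collision concrete, here is a minimal pure-Python sketch of how a priority-based choice between two overlapping terminals plays out. It is a toy model, not Lark's lexer: the regex patterns, the `candidates`, `pick`, and `with_priority` helpers, and the "first listed wins a tie" rule are all assumptions made for illustration, and the thread's real grammar may define these terminals differently.

```python
import re

# Toy terminal table: name -> (pattern, priority). The patterns are assumed
# for illustration; they are not taken from the grammar in this thread.
TERMINALS = {
    "ROMAN_NUMERALS": (re.compile(r"[ivxlcdm]+"), 0),
    "LCASE_LETTER": (re.compile(r"[a-z]"), 0),
}


def candidates(text, terminals=TERMINALS):
    """Return the sorted names of every terminal that matches all of `text`."""
    return sorted(
        name for name, (pattern, _) in terminals.items() if pattern.fullmatch(text)
    )


def pick(text, terminals=TERMINALS):
    """Return the matching terminal with the highest priority.

    Ties go to whichever terminal is listed first in the table (a toy rule,
    not a claim about how Lark breaks ties).
    """
    matching = [
        (name, priority)
        for name, (pattern, priority) in terminals.items()
        if pattern.fullmatch(text)
    ]
    if not matching:
        raise ValueError(f"no terminal matches {text!r}")
    # max() returns the first maximal element, so listing order breaks ties.
    return max(matching, key=lambda item: item[1])[0]


def with_priority(name, priority, terminals=TERMINALS):
    """Return a copy of the table with one terminal's priority changed."""
    updated = dict(terminals)
    pattern, _ = updated[name]
    updated[name] = (pattern, priority)
    return updated


print(candidates("i"))                              # both terminals match "i"
print(pick("i"))                                    # equal priority: tie-break decides
print(pick("i", with_priority("LCASE_LETTER", 2)))  # mimics LCASE_LETTER.2
```

In this toy, raising the priority of LCASE_LETTER to 2 flips the choice for "i", which is the effect the `.2` suffix is meant to have. Whether that alone fixes the parse depends on the rest of the grammar.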
Thanks for the reply @MegaIng!
Can you also explain why "10(i)(i)" doesn't throw an exception? Doesn't the same conflict apply, since i can be parsed as either LCASE_LETTER or ROMAN_NUMERALS?
The reason this happens is that both rules can be empty, which causes the lexer to always jump over one of them in order to match the terminal with the higher priority.
With one rule empty and the second one matched, the parser expects an EOF, not more input. The introduction of ( forces the rule to not be empty.
So, changing the priority on LCASE_LETTER won't help. But not allowing the first rule to be empty will.
The Earley algorithm will know how to resolve this ambiguity automatically.
Thank you for taking the time to share @erezsh! Really appreciate your input :))
I've been struggling for hours over why an exception is being thrown. If anyone can provide some guidance, I'll be really grateful! :)
I'm using the following grammar in lark-parser to parse letters of the alphabet and Roman numerals:
When I use this rule and parse the following text, I get an exception:
For the life of me, I can't think of why this throws an exception. But this is fine:
result = Lark(grammar, parser='lalr').parse("10(i)(i)") # no error