Closed DenizBasgoren closed 3 weeks ago
Can you cut this example down to be smaller/less complicated in a form that still shows off the issue, please?
Alright, but just by judging from the results, if a fails and b matches, then a / b should match as well, but what I observe is the opposite...
Oh, never mind. I see your issue.
line =
_ a:inst? _ b:comment?
This rule matches the empty string.
Yes, an empty string is a correct match for a "line", but that's not the issue here. I provided a specific input above.
That rule always matches. The second half of the `/` will never fire because of that.
Removing the `/` in `start` might or might not fix your issue, depending on what behavior you want.
I'd probably go with something like:
start = blankLine / instLine / wrongLine
blankLine = _ b:comment?
instLine = _ a:inst _ b:comment?
Sorry, that has the same problem. Like this then:
blankLine = _ comment? &EOL
EOL = '\n' / EOF
EOF = !.
You could probably keep your current approach and just add that `&EOL` to your existing `line` rule.
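To illustrate what the `&EOL` lookahead changes, here is a rough analogy using Python regular expressions rather than a PEG parser. The definitions of `_` and `comment` are not shown in this thread, so the sketch assumes `_` is optional spaces or tabs and that a comment starts with `;` and runs to the end of the line; both are hypothetical stand-ins.

```python
import re

# Assumed stand-ins (not from the thread): `_` is [ \t]*, a comment is `;` up to end of line.
# Without a lookahead, every part is optional, so the pattern matches "" on any input.
BLANK_NO_LOOKAHEAD = re.compile(r"[ \t]*(?:;[^\n]*)?")

# With a lookahead modelling `&EOL`: the match must be followed by a newline
# or the end of the input, but the lookahead itself consumes nothing.
BLANK_WITH_LOOKAHEAD = re.compile(r"[ \t]*(?:;[^\n]*)?(?=\n|\Z)")

text = "[]=[]++[]"
print(BLANK_NO_LOOKAHEAD.match(text).end())         # 0: an empty match still counts as success
print(BLANK_WITH_LOOKAHEAD.match(text))             # None: the rule now fails on a non-blank line
print(BLANK_WITH_LOOKAHEAD.match("  ; note\nmov"))  # matches "  ; note"
```

Because the second pattern fails on a non-blank line instead of succeeding with an empty match, a later alternative in an ordered choice would get a chance to run.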
Alright, that might be a solution, but the lack of logic here still bugs me. Please try the input `[]=[]++[]` and see for yourself:
`start = line`: here `line` matches 0 characters, so the overall match fails. That's expected.
`start = wrongLine`: this matches the entire thing, and the overall match succeeds. That's expected as well; the entire string is consumed.
`start = line / wrongLine`: now, simply by logic, `a / b` should match if either `a` or `b` matches. In this case `wrongLine` matches, so this should match as well. But what I'm observing is that it fails. Can you please explain what the mistake might be here?
The reason is the implicit EOF rule, which checks that no input is left over. You should mentally add it to all your examples:
start = line EOF; // `line` matched "", `EOF` failed
start = wrongLine EOF; // `wrongLine` matched all possible input, `EOF` matched no (left) input
start = (line / wrongLine) EOF; // !!!! `line` matched "", `EOF` failed
As you can see, the combination of `line` and `wrongLine` is not just a manual parse of one rule and then the other. `start = line` failed precisely because of the hidden EOF rule, which guarantees that all input was consumed. The `line` rule itself matched, as @hildjj already pointed out.
> `start = line`: here `line` matches 0 characters, so the overall match fails. That's expected.

Not quite. `line` matches 0 characters, so the overall match succeeds.
More clarity: the `line` rule matches. The parser fails because of what Mingun said: when there is input left over after the startRule finishes, parsing fails. This is equivalent to there being an implicit EOF rule at the end of your startRule.
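The behaviour described above can be reproduced with a toy model in Python. This is only a sketch of PEG semantics, not Peggy's implementation: `line` and `wrong_line` are hypothetical stand-ins for the rules in this thread, and `parse` models the implicit EOF check as "the start rule matched and no input is left".

```python
# Toy PEG model: a rule takes (text, pos) and returns the new position, or None on failure.

def line(text, pos):
    # Stand-in for a rule whose parts are all optional: it can always succeed
    # without consuming anything.
    return pos

def wrong_line(text, pos):
    # Stand-in for a rule that happens to consume the rest of this input.
    return len(text)

def choice(*rules):
    # Ordered choice: commit to the first alternative that succeeds.
    def rule(text, pos):
        for r in rules:
            end = r(text, pos)
            if end is not None:
                return end
        return None
    return rule

def parse(start, text):
    # The start rule must match AND leave no input behind (the implicit EOF check).
    end = start(text, 0)
    return end is not None and end == len(text)

text = "[]=[]++[]"
print(parse(line, text))                      # False: matched "", input left over
print(parse(wrong_line, text))                # True: everything consumed
print(parse(choice(line, wrong_line), text))  # False: `line` wins the choice with ""
print(parse(choice(wrong_line, line), text))  # True: order of alternatives matters
```

The third case is the one from the question: the choice itself succeeds with the empty match from `line`, so `wrong_line` is never tried, and only afterwards does the leftover input make the whole parse fail.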
@Mingun's line of reasoning seems consistent. I think this edge case deserves its own section, or at least a mention, in the documentation (if it doesn't have one already). Thanks @Mingun @hildjj
My solution is to change
line =
_ a:inst? _ b:comment?
wrongLine =
_ a:wrongInst? _ b:wrongComment?
to
line =
_ a:inst? _ b:comment? !.
wrongLine =
_ a:wrongInst? _ b:wrongComment? !.
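Why appending `!.` helps can be sketched with a small Python model of PEG rules. This is an illustration under assumptions, not Peggy's implementation: `optional_line` and `rest_of_input` are hypothetical stand-ins for the bodies of `line` and `wrongLine`, and `eof` models `!.`.

```python
# Toy PEG model: a rule takes (text, pos) and returns the new position, or None on failure.

def optional_line(text, pos):
    # Stand-in for `_ a:inst? _ b:comment?`: everything is optional, so it can match "".
    return pos

def rest_of_input(text, pos):
    # Stand-in for the body of wrongLine on this input: consumes whatever is left.
    return len(text)

def eof(text, pos):
    # Models `!.`: succeeds only when no character is left, and consumes nothing.
    return pos if pos == len(text) else None

def seq(*rules):
    # Sequence: every rule must succeed in turn, each starting where the last one ended.
    def rule(text, pos):
        for r in rules:
            pos = r(text, pos)
            if pos is None:
                return None
        return pos
    return rule

def choice(*rules):
    # Ordered choice: commit to the first alternative that succeeds.
    def rule(text, pos):
        for r in rules:
            end = r(text, pos)
            if end is not None:
                return end
        return None
    return rule

line = seq(optional_line, eof)
wrong_line = seq(rest_of_input, eof)
start = choice(line, wrong_line)

print(start("[]=[]++[]", 0))  # 9: `line` fails at `!.`, so `wrong_line` is tried
print(start("", 0))           # 0: an empty input is still accepted by `line`
```

With the end-of-input check inside each alternative, `line` fails instead of succeeding with an empty match, which is what lets the ordered choice fall through to `wrong_line`.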
Here's my parser:
My input string is `[]=[]++[]`.
With `start = line`, the match fails. (correct behavior)
With `start = wrongLine`, the match succeeds. (correct behavior)
With `start = line / wrongLine`, the match should succeed by the ordered-or rule, but it fails. (incorrect behavior)
Please help me out with this one.