russells-crockpot closed this issue 2 years ago
I was able to create a toy grammar that reproduces it. When `1` or `(1)` is input, it does not hang, but when `1(0)` is input, it hangs.
start <- number eos*
eos <- &'('
number <- [0-9]+
The string `1` matches the grammar; `1` matches `number`, and then we attempt to match `eos`, which fails, so `eos*` consumes no input.
The string `(1)` is rejected because the first character is `(`, which does not match `number`.
`1(0)` matches the char `1` using `number`, and then attempts to match `eos`. The next char is `(`, so the rule `&'('` matches, but then the `eos` rule does not consume any further input. So `eos*` loops forever, creating an empty node and not advancing the cursor.
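The three cases above can be traced with a small hand-written matcher. This is only a sketch in Python, not the generated parser's actual code: the function names (`match_number`, `match_eos`, `parse_start`) and the `max_iterations` cap are hypothetical, added so the stalled loop can be observed instead of hanging. The sketch also skips any end-of-input check.

```python
import re

NUMBER = re.compile(r"[0-9]+")

def match_number(text, pos):
    """number <- [0-9]+ : return the new cursor position, or None on failure."""
    m = NUMBER.match(text, pos)
    return m.end() if m else None

def match_eos(text, pos):
    """eos <- &'(' : succeed without consuming if the next char is '('."""
    return pos if text.startswith("(", pos) else None

def parse_start(text, max_iterations=1000):
    """start <- number eos*

    The cap on the eos* loop is an assumption for illustration only;
    a naive repetition loop has no such cap and would spin forever.
    """
    pos = match_number(text, 0)
    if pos is None:
        return "rejected"
    for _ in range(max_iterations):
        nxt = match_eos(text, pos)
        if nxt is None:
            return "matched"   # eos failed, so eos* stops here
        pos = nxt              # nxt == pos: success, but the cursor did not move
    return "stalled"           # every iteration succeeded at the same position

for sample in ["1", "(1)", "1(0)"]:
    print(sample, "->", parse_start(sample))
```

For `1(0)` every pass through the loop succeeds at the same cursor position, which is the non-terminating case described above.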
Using a lookahead as the final element in a rule, or as the only element, is likely to cause this behaviour and should be avoided. You should always follow a lookahead with something that is guaranteed to consume input, especially if you're inside a loop.
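To illustrate that advice, here is a minimal Python sketch of a repeated rule in which the lookahead is followed by elements that consume input. The rule `args <- &'(' '(' number ')'` and the grammar `start <- number args*` are hypothetical, made up for this example, and the matcher is hand-written rather than taken from any generated parser.

```python
import re

NUMBER = re.compile(r"[0-9]+")

def match_number(text, pos):
    """number <- [0-9]+ : return the new cursor position, or None on failure."""
    m = NUMBER.match(text, pos)
    return m.end() if m else None

def match_args(text, pos):
    """args <- &'(' '(' number ')'   (hypothetical rule)

    The lookahead only peeks; the elements after it consume input, so a
    successful match always returns a position greater than pos.
    """
    if not text.startswith("(", pos):    # &'(' : peek, consume nothing
        return None
    pos += 1                             # '(' : consume the parenthesis
    pos = match_number(text, pos)        # number
    if pos is None or not text.startswith(")", pos):
        return None
    return pos + 1                       # ')' : consume the closing parenthesis

def parse_start(text):
    """start <- number args* : return the final cursor position, or None."""
    pos = match_number(text, 0)
    if pos is None:
        return None
    while True:
        nxt = match_args(text, pos)
        if nxt is None:
            return pos
        assert nxt > pos                 # each successful iteration makes progress
        pos = nxt

print(parse_start("1(0)"))
```

Because each successful `args` match moves the cursor forward, the `args*` loop has to stop once the input runs out or `args` fails.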
Can I close this issue or did you have further information about it?
When defining a rule such as:
end_of_statement <- ';' / eol / &']'
then the parser (in Python and JavaScript, at least) hangs forever. It only seems to do this when it actually encounters that character, however. It also occurs with negative lookaheads.