It looks to me like this is exactly the correct behaviour.
According to this spec, when the scanner encounters a line with a single # as input, the first rule matches, the # is pushed back, the state CONFIGURABLE_EOL_COMMENT is entered, and nothing is returned, so the scanner continues to match input. The next input is again #, because it was pushed back into the stream. Because we are now in state CONFIGURABLE_EOL_COMMENT, two rules can match. For a line with a single #, both have the same match length, so the rule with higher priority (earlier in the file) is the one that is chosen. This rule again pushes back #, and so on.
To break the cycle, you can either put the <CONFIGURABLE_EOL_COMMENT> rule first, or guard the "#" rule with <YYINITIAL> so that it is not available in state <CONFIGURABLE_EOL_COMMENT>.
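For anyone who wants to see the mechanism in isolation, here is a rough Python sketch of the rule selection described above: longest match wins, and on a tie the earlier rule wins. It is not JFlex and not the actual spec from this issue. The rule tables, the state name COMMENT (standing in for CONFIGURABLE_EOL_COMMENT), the comment pattern `#[^\n]*` and the `scan` helper are all assumptions made up for illustration.

```python
import re

# Hypothetical rule tables: (states where the rule is active, pattern, action).
# BUGGY mirrors the situation described above: the "#" rule is unguarded and first.
HASH = re.compile(r"#")
COMMENT_LINE = re.compile(r"#[^\n]*")  # assumed shape of the comment rule

BUGGY = [
    ({"YYINITIAL", "COMMENT"}, HASH, "push_back"),
    ({"COMMENT"}, COMMENT_LINE, "emit_comment"),
]
# Fix 1: guard the "#" rule so it is only active in YYINITIAL.
GUARDED = [
    ({"YYINITIAL"}, HASH, "push_back"),
    ({"COMMENT"}, COMMENT_LINE, "emit_comment"),
]
# Fix 2: list the comment rule first so it wins ties.
REORDERED = [
    ({"COMMENT"}, COMMENT_LINE, "emit_comment"),
    ({"YYINITIAL", "COMMENT"}, HASH, "push_back"),
]


def scan(text, rules, max_steps=20):
    """Tokenize text; raise RuntimeError if the step budget runs out."""
    pos, state, tokens = 0, "YYINITIAL", []
    for _ in range(max_steps):
        if pos >= len(text):
            return tokens
        best = None  # (match length, action)
        for states, pattern, action in rules:
            if state not in states:
                continue
            m = pattern.match(text, pos)
            # Only a strictly longer match replaces the current best,
            # so on equal length the earlier rule keeps priority.
            if m and (best is None or m.end() - pos > best[0]):
                best = (m.end() - pos, action)
        if best is None:
            raise ValueError(f"no rule matches at position {pos}")
        length, action = best
        if action == "push_back":
            # The matched "#" is pushed back: pos stays put, only the state changes.
            state = "COMMENT"
        else:
            tokens.append(("COMMENT", text[pos:pos + length]))
            pos += length
            state = "YYINITIAL"
    raise RuntimeError("scanner made no progress (endless loop)")


print(scan("#comment", BUGGY))   # the longer comment match wins, no loop
print(scan("#", GUARDED))        # lone "#" is consumed by the comment rule
print(scan("#", REORDERED))      # tie is broken in favour of the comment rule
```

With the BUGGY table, `scan("#", BUGGY)` exhausts the step budget and raises, which corresponds to the hang on a lone #. The same table handles `#comment` because, under the assumed comment pattern, the comment rule matches more characters than the "#" rule, so priority never comes into play.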
Thanks, I didn't know that match length matters. It indeed explains the behavior.
I have a situation where the lexer behaves differently depending on the input, and I think this might be a bug.
The lexer has an option (hashStartsComment) to treat '#' either as a separate token or as the start of an end-of-line comment.
When # is encountered and it should start a comment, I push it back so that it is included in the comment, and enter a state for the comment.
What I observe is that the lexer works fine for lines like #comment, but hangs for a lone # on a line.
My understanding is that after I push the # back, the same rule for "#" matches again, entering an endless loop. This is fine.
But why doesn't it enter the endless loop for a # comment line? It looks like a bug.