Closed mitranim closed 1 year ago
Here's an example where it's not immediately obvious to me why it would overflow (which it does):
main = (~func)*;
func = ident? (~main)*;
ident = '\b[[:alpha:]_][[:alnum:]_]*\b';
I started writing an SBNF file based on JSON5's grammar, and quickly ran into this issue (at least I think it's this issue).
The code is here, though it's not done and I think some of my regexes are invalid for sbnf:
https://github.com/bschwind/sublime-json5/blob/main/json5_grammar.txt
The feature that all these examples have in common is left recursion. The first example is direct left recursion, the second is indirect. Since the first term ident?
in the definition of func
can be empty, main
has indirect left recursion since main -> func -> main
is a valid expansion.
@bschwind you have left recursion here and here. In those cases the fix is easy, just swap a : b | a ',' b ;
into a : b | b ',' a ;
. The new rule has right recursion but that's fine in top-down parsers. A more concise version of that rule is a : b (',' b)*;
.
@nuchi thank you for that explanation, it makes good sense. Glad it can be rearranged to not cause infinite recursion. This is the first time I've worked directly with grammars like this so I'm learning a lot here.
The following trivial definition causes a stack overflow:
In this simple example, the writer can probably immediately understand what this means. In more complex examples, the cause can be harder to track down. Someday it might be worth fixing, but doesn't seem like a priority.