lark-parser / lark

Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
MIT License

Undefined rule error when trying to create a parser using some of the included grammars #1310

Closed Elegantyear closed 1 year ago

Elegantyear commented 1 year ago

Describe the bug

When attempting to create a parser using any of the included grammar files except for lark.lark, an undefined rule error occurs.

lark.exceptions.GrammarError: Using an undefined rule: NonTerminal('start')

The same error occurs for python.lark, common.lark, and unicode.lark.

The rule that seems to trigger the error is [Rule(NonTerminal('$root_start'), [NonTerminal('start'), Terminal('$END')], None, RuleOptions(False, False, None, None))], which is added to every rule set by the GrammarAnalyzer in grammar_analysis.py. Since it is added regardless of the input grammar, I suspect it is not the actual cause of the issue, but my debugging skills are insufficient to explore further.

To Reproduce

The following appears to be a minimal example:

import lark
from pathlib import Path

parser = lark.Lark.open(Path(lark.__file__).parent / 'grammars/common.lark')

MegaIng commented 1 year ago

You need to provide your own start symbol. For the python grammar, for example, this is file_input. The other two, common and unicode, are just token libraries: they don't define any rules, so it doesn't make sense to parse with them.

Elegantyear commented 1 year ago

Thank you for the help, however I am still confused.

Do I add something like start: file_input to the start of the python grammar file? That seems to be what is indicated in the docs and examples, but I can't get it to work.

Alternatively, would I do something like parser = lark.Lark.open(Path(lark.__file__).parent / 'grammars/python.lark', start='file_input')? That doesn't seem to work either.

MegaIng commented 1 year ago

Both of those will work, with the latter being preferable if you want to use the bundled grammar.

Elegantyear commented 1 year ago

The issue ended up being that I wasn't explicitly specifying a parser, which for some reason causes it to break.

Doesn't work: parser = lark.Lark.open(Path(lark.__file__).parent / 'grammars/common.lark', start='file_input')

Works: parser = lark.Lark.open(Path(lark.__file__).parent / 'grammars/python.lark', parser='lalr', start='file_input')

Thanks for the assistance.