Open ldevyataykina opened 6 years ago
How are you instantiating the parser?
@erezsh I installed it with pip.
Or is it better to clone the repository?
No, I mean, how are you constructing the Lark object?
See these examples of how to parse indentation: https://github.com/erezsh/lark/blob/master/examples/indented_tree.py https://github.com/erezsh/lark/blob/master/examples/python_parser.py
@erezsh I use json_parser = Lark(grammar, parser='lalr', postlex=PythonIndenter(), start='value')
and it returns an error.
I can't reproduce the error you're getting, but try always adding a newline at the end of the text:
parser.parse(text + '\n')
If that doesn't work, try to reduce the grammar and input-text into a simpler grammar and text that still produces the same error.
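To illustrate why the trailing newline can matter, here is a minimal sketch of what an indentation post-lexer does in general. It is not Lark's implementation: `indent_tokens` and the token names are hypothetical, and it assumes space-only indentation and well-formed dedents. The point is that, without a final newline, the last statement is never followed by a NEWLINE token, so a grammar that ends every statement with a newline hits the end of input too early.

```python
def indent_tokens(text):
    """Flatten indented text into tokens (hypothetical helper, not Lark's API)."""
    tokens = []
    levels = [0]  # stack of currently open indentation widths
    lines = text.split('\n')
    for i, line in enumerate(lines):
        is_last = (i == len(lines) - 1)
        if not line.strip():
            continue  # blank pieces, including the one after a final '\n', carry no tokens
        width = len(line) - len(line.lstrip(' '))
        if width > levels[-1]:
            levels.append(width)
            tokens.append('INDENT')
        while width < levels[-1]:
            levels.pop()
            tokens.append('DEDENT')
        tokens.append(('LINE', line.strip()))
        if not is_last:
            tokens.append('NEWLINE')  # only emitted when a '\n' actually follows the line
    # Close any blocks that are still open at the end of the input.
    while len(levels) > 1:
        levels.pop()
        tokens.append('DEDENT')
    return tokens


text = "a\n    b"
print(indent_tokens(text))         # last LINE is followed directly by DEDENT
print(indent_tokens(text + '\n'))  # last LINE is followed by NEWLINE, then DEDENT
```

With the extra newline, every statement is terminated the same way, which is what a rule shaped like "statement followed by a newline" needs.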
@erezsh this error appears when I use
json_parser = Lark(grammar, start='value')
tree = json_parser.parse(file)
But with
json_parser = Lark(grammar, parser='lalr', postlex=PythonIndenter(), start='value')
tree = json_parser.parse(file + '\n')
it returns
UnexpectedToken: Unexpected token Token(STRING, 'x') at line 1, column 5. Expected: dict_keys(['condition', 'VARIABLE_NAME', 'variable', 'SIGNED_NUMBER']) Context: <no context>
The reason for this error is that your terminal for STRING matches the same text as VARIABLE_NAME.
I assume you meant to put quotes around your string, like:
STRING: /"[\wа-яА-Я0-9_.-]+"/
@erezsh unfortunately, that doesn't help. But solving the problem with indentation is the higher priority. Can you tell me where the error related to indentation is in my grammar?
Why do you think the problem is with indentation?
"try to reduce the grammar and input-text into a simpler grammar and text that still produces the same error." This is still my advice.
@erezsh I have a working grammar, but after adding correct indentation to the input code, I get an error: https://github.com/ldevyataykina/ldsl_grammar/blob/master/1_ver.ipynb
It looks like you have a grammar that works with the Earley algorithm, which is a lot more forgiving than LALR. Unfortunately, there is no simple way (that I'm aware of) to make Earley support indentation (since indentation-sensitive syntax is not context-free).
If you can make this work, even without indentation:
Lark(grammar, parser='lalr', start='value')
Then it will be much easier for you to add indentation later.
@erezsh with the argument parser='lalr'
it doesn't work:
UnexpectedToken: Unexpected token Token(STRING, 'возраст') at line 1, column 5. Expected: dict_keys(['__ANONSTR_6', '$END', '__RSQB', '__RBRACE', '__ANONSTR_5', '__COMMA', '__ANONSTR_7', 'condition', 'VARIABLE']) Context: <no context>
Why does it expect dict_keys?
It doesn't expect "dict_keys", it expects one of the terminals listed there. I already told you why this happens. "The reason for this error is that your terminal for STRING matches the same text as VARIABLE."
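The "dict_keys" wording is just how Python 3 prints the key view of a dictionary. A small sketch, using a hypothetical `expected` dictionary rather than Lark's actual internals:

```python
# Hypothetical table keyed by terminal name; the values are irrelevant here.
expected = {'condition': 1, 'VARIABLE': 2}

# Formatting the key view directly produces the "dict_keys([...])" wrapper.
print("Expected: {}".format(expected.keys()))
# Expected: dict_keys(['condition', 'VARIABLE'])

# Converting to a list first shows only the terminal names.
print("Expected: {}".format(list(expected)))
# Expected: ['condition', 'VARIABLE']
```

So the names inside the brackets are the terminals the parser would have accepted at that position.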
I've written a grammar
and I'm trying to execute the following code:
but it returns
ParseError: Unexpected end of input! Expecting a terminal of: ['_NEWLINE', '_DEDENT', '__ANONSTR_3', '__ANONSTR_3', '__ANONSTR_3', 'STRING', '__ANONSTR_3']
I need to add correct indentation to the code to get a tree. How can I fix this so that the text can be parsed? @erezsh, can you help with this question?