lark-parser / lark

Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
MIT License

Token defined in rule B reports UnexpectedToken by using in rule A #1337

Closed Fxztam closed 10 months ago

Fxztam commented 10 months ago

Hi,

I am working on pattern-based searching of statements in a given programming language.

I am using a select rule that includes an "into" token:

select : "select"i rest_operat "into"i rest_operat "from"i rest_operat SEMICOLON

There is another rule:

rest_operat : (SYMBOLS | string | IDENTIFIER | word)*
rest : rest_operat (rest_operat)* SEMICOLON

But when the token "into" appears in input matched by the rule rest, I get the following error:

lark.exceptions.UnexpectedToken: Unexpected token Token('INTO', 'into') at line 17, column 13. Expected one of:

How can I solve the problem that the same tokens have to be used in several rules?

Thanks, Fried

erezsh commented 10 months ago

Your question isn't clear. Try to write a minimal example, 5 to 10 lines long, that shows your problem.

Fxztam commented 10 months ago

I have two rules, rest and select, and the select rule contains the token "into":

select : "select"i rest_operat "into"i rest_operat "from"i rest_operat SEMICOLON

In the short demo I can see that the tests of these rules work: if you comment out the test line (-- tester blah into;), all is fine.

But if I use the tokens "into" or "from" (defined in the select rule) in the test line:

tester blah into

then I get the following error:

lark.exceptions.UnexpectedToken: Unexpected token Token('INTO', 'into') at line 11, column 17.
Expected one of:
    * __ANON_0
    * SEMICOLON
    * LCASE_LETTER

So the select token set {into, from} is included in the rest token set (the tokens superset); is this the issue here and how can I solve this, please?

Thanks, Fried

erezsh commented 10 months ago

So, in other words, you want "into" to be both a keyword and a variable name, based on the context?

Fxztam commented 10 months ago

Yes, "into" is a keyword for the rule "select" and there are some other "into"s in the "rest" rule context, but the other "into"s should be ignored (since not used as a keyword in the "rest" rule).

Thanks, Fried

erezsh commented 10 months ago

Well, the simplest solution is something like this

rest_operat : WORD* | "into"i

Fxztam commented 10 months ago

Thank you so much for your answer!

Yes, the magic is in the rest_operat rule definition:

rest_operat : (LCASE_LETTER | word)* | "into"i | "from"i

So I have to extend this rule with the keywords used in the search-pattern rules, and then it works.
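
If more search-pattern rules are added later, the keyword list can be kept in one place and the rule text generated from it. The following is only a sketch of that idea; the helper `keyword_alternatives` and the list `SELECT_KEYWORDS` are hypothetical names, not part of Lark or of the grammar in this thread.

```python
def keyword_alternatives(keywords):
    """Return Lark alternatives for case-insensitive keyword literals."""
    for kw in keywords:
        # Keep the sketch simple: only plain alphabetic keywords.
        if not kw.isalpha():
            raise ValueError(f"not a plain keyword: {kw!r}")
    return " | ".join(f'"{kw}"i' for kw in keywords)


# Keywords used by the search-pattern rules (here only those of `select`).
SELECT_KEYWORDS = ["into", "from"]

rest_operat_rule = (
    "rest_operat : (LCASE_LETTER | word)* | "
    + keyword_alternatives(SELECT_KEYWORDS)
)
print(rest_operat_rule)
```

For the two keywords above this prints the same rule text as written by hand: `rest_operat : (LCASE_LETTER | word)* | "into"i | "from"i`. The resulting string would then be spliced into the rest of the grammar before it is passed to Lark.
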

Thanks again 👍 , Fried

erezsh commented 10 months ago

Happy to help.

For next time, it will save time if you include with your question a minimal code example that reproduces your problem.