lark-parser / lark

Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
MIT License

Problem matching the body of annotations like `@Annotation(title, some longer description)` #1325

Closed jjalowie closed 11 months ago

jjalowie commented 1 year ago

What is your question?

I'm trying to write a parser for annotations like `@Annotation(title, some longer description)`. I can't get the `description` rule in the grammar below to parse consistently. Given `@Annotation(title, some longer description)`, the `description` rule sometimes matches `some longer description` and sometimes matches ` some longer description` (note the leading space). I want the match to never include leading whitespace.

If you're having trouble with your code or grammar

I believe it's either a problem with the below grammar or a bug in the parsing process.

Code reproduction:

# File: parser.py

import lark

grammar = """
start: "@Annotation" "(" title "," description ")"
title: /[^,]+/
description: /[^)]+/

%import common.WS
%ignore WS
"""

text = "@Annotation(title, some longer description)"

parser = lark.Lark(grammar)
ir = parser.parse(text)
print(ir.children[1])

Explain what you're trying to do, and what is obstructing your progress.

It seems to me that `%ignore WS` sometimes takes precedence over the `description` rule and sometimes the other way around, and which one wins appears to be random. Running the same script a few times gives the output below (again, note the leading space that is sometimes matched in front of `some longer description`):


user@machine:/dir> python parser.py 
Tree(Token('RULE', 'description'), [Token('__ANON_2', ' some longer description')])
user@machine:/dir> python parser.py 
Tree(Token('RULE', 'description'), [Token('__ANON_2', 'some longer description')])
user@machine:/dir> python parser.py 
Tree(Token('RULE', 'description'), [Token('__ANON_2', ' some longer description')])
user@machine:/dir> python parser.py 
Tree(Token('RULE', 'description'), [Token('__ANON_2', 'some longer description')])
user@machine:/dir> python parser.py 
Tree(Token('RULE', 'description'), [Token('__ANON_2', ' some longer description')])

erezsh commented 11 months ago

Should be solved in the latest master.

jjalowie commented 9 months ago

Thank you!