lark-parser / lark

Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
MIT License

Dynamic Earley: Incorrect value for SymbolNode.end #1431

Closed by chanicpanic 1 week ago

chanicpanic commented 1 week ago

Consider the following:

from lark import Lark

grammar = r"""
start: "ABC"
"""

print(Lark(grammar, ambiguity="forest").parse("ABC"))

Output: (start, 0, 2, -inf)

The end index is 2, which suggests that only the substring "AB" was parsed. However, the full string was parsed, so the end index should be 3.
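To make the off-by-one concrete, here is a minimal sketch. It assumes that the start and end values are half-open indices into the input, like Python slice bounds, which the (0, 3) output for the rule-based grammar suggests. The helper covered_text is hypothetical and is not part of Lark.

```python
def covered_text(text: str, start: int, end: int) -> str:
    """Return the part of `text` covered by the half-open span [start, end).

    Hypothetical helper for illustration only; not a Lark API.
    """
    return text[start:end]


text = "ABC"

# Span taken from the reported output (start, 0, 2, -inf).
reported = covered_text(text, 0, 2)

# Span a parse of the whole input would be expected to report.
expected = covered_text(text, 0, len(text))

print(repr(reported), repr(expected))
```

Under that assumption, an end index of 2 covers only "AB", while a node spanning the whole input should end at len("ABC"), which is 3.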

Note that the issue does not occur if the string literal is moved into a separate rule instead:

from lark import Lark

grammar = r"""
start: abc
abc: "ABC"
"""

print(Lark(grammar, ambiguity="forest").parse("ABC"))

Output: (start, 0, 3, -inf)

I will open a PR with a fix.