lark-parser / lark

Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.

Transformer Not Applying Expected Transformations in Lark Parser #1416

Closed chenshimeng closed 2 weeks ago

chenshimeng commented 1 month ago
## Description
When using the Lark parser with a custom Transformer to parse and transform a simple grammar, the expected transformations specified in the Transformer class do not seem to be applied. Instead of getting the transformed output, I receive the raw parsed data.

## Steps to Reproduce
Here is the minimal code snippet to reproduce the issue:

```python
from lark import Lark, Transformer

grammar = r"""
?start: acos_func
?acos_func: ("acos" | "ACOS") "(" NUMBER ")"
NUMBER: /-?\d+(\.\d+)?/
%import common.WS
%ignore WS
"""

class MyTransformer(Transformer):
    def acos_func(self, args):
        return "ACOS called with argument: " + str(args[0])

parser = Lark(grammar, parser='lalr', transformer=MyTransformer())
test_input = "ACOS(1.0)"
print(parser.parse(test_input))
```

## Expected Behavior

I expect the output to be:

ACOS called with argument: 1.0

This is based on the `acos_func` method defined in the `MyTransformer` class.

## Actual Behavior

Instead of the expected string, the output is just:

1.0

This suggests that the `acos_func` method in `MyTransformer` is not being called.

## Environment

erezsh commented 1 month ago

I think that you're right that your function isn't being called.

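That's because you defined it as `?acos_func`. The `?` tells Lark that if the rule matches with only one child, it should inline the rule and not call its callback. Your rule only has one child, `NUMBER`, because anonymous string tokens such as `"acos"`, `"("` and `")"` are filtered out of the tree.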

chenshimeng commented 1 month ago

Thank you for your assistance and explanation regarding the behavior of the `?` prefix in Lark's grammar rules. Following your guidance, I have conducted further tests and confirmed that removing the `?` prefix indeed allows the `acos_func` transformation method to be called as expected. Additionally, I discovered that if I retain the `?` prefix in the rule `?acos_func: ("acos" | "ACOS") "(" NUMBER ")"` but append `-> acos_func` at the end, the transformation method `acos_func` also gets called correctly.

Could you please advise which solution is considered more optimal or appropriate in terms of Lark's best practices? Are there any considerations regarding performance, readability, or future maintenance that I should be aware of when choosing between these two approaches?

Thank you once again for your support. Your explanation was instrumental in helping me understand and resolve the issue.

erezsh commented 2 weeks ago

Well, an alias (`->`) doesn't have a performance cost. Using `?` does have a small cost, but I think it's negligible.