lark-parser / lark

Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
MIT License
4.61k stars 395 forks source link

Trees are always simpler to work with than state-machines. #4

Open kootenpv opened 7 years ago

kootenpv commented 7 years ago

Hey, it looks very interesting!

You make this statement 2 times: "Trees are always simpler to work with than state-machines."

Could you back this up with some resource or examples? I'm very curious.

erezsh commented 7 years ago

Hi, thanks! The simple solution is that the parse-trees that lark produces are reduce-able to a parser's state-machine. If you use the Transformer class on a tree, it will call your methods at the exact same order you would get in a traditional parser, and your return-values will be treated the same. So at the very least, a parse-tree is equivalent to a state-machine. But: 1) Trees allow you to see the "state-machine" visually: At once, instead of in steps. You can also see the tree fold step-by-step as you process it. 2) Trees allow your computation to be aware of previous and future states, so they are more powerful computationally. 3) You can process a tree in steps. That means you can separate your processing into several logical steps if you like. A state-machine requires that each method performs a complete computation.

I am not aware of any benefits for state-machines, besides performance. But Lark solves that too: Once you write a complete transformer, you can hand it off to Lark and it will execute it as a state-machine, without building a tree (for example, when parsing json, it uses about half the memory)

charles-esterbrook commented 3 years ago

This could go in the docs somewhere if it's not already there.

erezsh commented 3 years ago

@charles-esterbrook It's mentioned here: https://lark-parser.readthedocs.io/en/latest/philosophy.html?highlight=detail%20here#always-build-a-parse-tree-unless-told-not-to

charles-esterbrook commented 3 years ago

Cool, then I think this ticket can be closed.