lark-parser / lark

Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
MIT License

Unify `Token` and `Tree` position information. #1254

Open HolonProduction opened 1 year ago

HolonProduction commented 1 year ago

For trees, position information lives in the `meta` attribute. For a `Token`, it is stored directly in attributes on the token itself. This makes it hard to write code that accepts both `Tree` and `Token`. It would be good to have a unified way to get this information.

I think it would make sense to add a `meta` attribute to tokens as well. To keep backwards compatibility, the direct attributes could be replaced with `@property` accessors that return the values from `meta`.
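A minimal sketch of what this proposal could look like. `Meta` and `SketchToken` are hypothetical stand-ins written for illustration, not Lark's actual classes, and only a few position fields are shown:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Meta:
    """Hypothetical position container, loosely modeled on Tree.meta."""
    line: Optional[int] = None
    column: Optional[int] = None
    end_line: Optional[int] = None
    end_column: Optional[int] = None


class SketchToken(str):
    """Illustrative stand-in for a token; NOT Lark's real Token class."""

    def __new__(cls, type_, value, meta=None):
        inst = super().__new__(cls, value)
        inst.type = type_
        # Position data lives in a separate object, as it does for trees.
        inst.meta = meta if meta is not None else Meta()
        return inst

    # The old direct attributes become read-only views onto `meta`,
    # so existing code that reads `token.line` keeps working.
    @property
    def line(self):
        return self.meta.line

    @property
    def column(self):
        return self.meta.column

    @property
    def end_line(self):
        return self.meta.end_line

    @property
    def end_column(self):
        return self.meta.end_column


tok = SketchToken("NAME", "foo", Meta(line=3, column=7))
print(tok, tok.line, tok.meta.column)
```

Note that every token now owns a second object (`meta`), and that the properties are read-only, so any code assigning to `token.line` would need a setter as well.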

erezsh commented 1 year ago

I think it's not a bad idea. But there are also performance considerations: tokens are a very common object, and splitting each one into two objects could affect Lark's speed and memory consumption. (For trees it's different, because collecting meta is optional, its meta has a lot more attributes than a token does, and `Tree` has a lot more methods in general, which crowds the namespace.)

As a temporary solution, you could write a `_get_meta` function that returns `.meta` for a `Tree`, or the token itself for a `Token`.