lark-parser / lark

Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
MIT License

Are custom post-lexers supported? #1206

Closed ederic-oytas closed 2 years ago

ederic-oytas commented 2 years ago

What is your question?

Hi, I noticed that the postlex option has the type Optional[PostLex]. However, the PostLex class does not appear anywhere in the documentation, nor in the __init__.py file. This makes it impossible to create custom postlexers without subclassing a class that is undocumented and possibly unstable. So is there another way to use a custom postlexer? If not, I'd be happy to try to implement a means to do so.

MegaIng commented 2 years ago

PostLex is not a class that needs to be subclassed; it's a protocol. All you need is an object with a process method and an always_accept attribute. But you are correct, this should be explained in the docs for Lark.__init__.

(Note: we are not using a proper Protocol from typing since we still support Python 3.6.)
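
A minimal sketch of the protocol described above, assuming only what the comment states: the object needs a process method and an always_accept attribute. The DropComments class, the COMMENT terminal name, and the FakeToken stand-in are hypothetical names introduced for illustration; the sketch assumes that process receives an iterable of tokens carrying a type attribute and returns an iterable of tokens.

```python
from collections import namedtuple

# Hypothetical stand-in for a lexer token, so the sketch runs without lark.
# The only assumption is that tokens expose a `type` attribute.
FakeToken = namedtuple("FakeToken", ["type", "value"])


class DropComments:
    """Duck-typed postlexer: no base class, just the two required members."""

    # Terminal names the lexer should accept even where the parser does not
    # expect them (COMMENT is an assumed terminal name for this example).
    always_accept = ("COMMENT",)

    def process(self, stream):
        # Pass every token through except comments.
        for tok in stream:
            if tok.type != "COMMENT":
                yield tok


stream = [
    FakeToken("WORD", "a"),
    FakeToken("COMMENT", "# note"),
    FakeToken("WORD", "b"),
]
kept = [tok.value for tok in DropComments().process(stream)]
```

If the protocol works as described, an instance would then be handed to the parser through the postlex option, for example Lark(grammar, parser="lalr", postlex=DropComments()), where grammar is whatever grammar defines the COMMENT terminal.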

ederic-oytas commented 2 years ago

So are there any plans to make this officially part of the API for Lark?

MegaIng commented 2 years ago

It is fully officially part of the API. What makes you think it's not?

ederic-oytas commented 2 years ago

The fact that it is undocumented makes it seem that it may be subject to change in the future.

MegaIng commented 2 years ago

It is indirectly documented: the postlex argument is documented, so the way it behaves won't change. The same goes for the Indenter class. You can also look up the PostLex class. Yes, it's not exported by __init__, but neither are many other public APIs in the .transformer module. You can import PostLex from lark.lark.
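
For readers who prefer an explicit base class, for example to help type checkers, the import path mentioned above can be used as sketched here. The CollapseNewlines class and the NEWLINE terminal name are hypothetical, and the fallback class exists only so the snippet runs where lark is not installed; it is not part of Lark.

```python
from types import SimpleNamespace

try:
    # Import path as given in the thread; not re-exported from lark/__init__.py.
    from lark.lark import PostLex
except ImportError:
    # Hypothetical stand-in so the sketch still runs without lark installed.
    class PostLex:
        always_accept = ()

        def process(self, stream):
            return stream


class CollapseNewlines(PostLex):
    """Hypothetical postlexer that merges runs of NEWLINE tokens into one."""

    # Assumed terminal name; adjust to match the grammar in use.
    always_accept = ("NEWLINE",)

    def process(self, stream):
        previous_was_newline = False
        for tok in stream:
            is_newline = tok.type == "NEWLINE"
            # Skip a NEWLINE that directly follows another NEWLINE.
            if is_newline and previous_was_newline:
                continue
            previous_was_newline = is_newline
            yield tok


# Stand-in tokens: only the `type` attribute is assumed.
tokens = [SimpleNamespace(type=t) for t in ["NAME", "NEWLINE", "NEWLINE", "NAME"]]
types_out = [tok.type for tok in CollapseNewlines().process(tokens)]
```

Subclassing is optional here: per the earlier comment, the duck-typed form is treated the same way, so the base class mainly serves as documentation of intent.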

ederic-oytas commented 2 years ago

Alright, thank you for the clarification. Hopefully this will also be explained in the docs for the benefit of future developers.