lark-parser / lark

Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
MIT License

Data structure for getting possible terminal sequences? #1380

Closed by RevanthRameshkumar 6 months ago

RevanthRameshkumar commented 6 months ago

Is there any internal or exposed data structure or method for seeing all possible terminals that can come directly after a specified terminal, independent of context? I want this without using the interactive parser and without parsing a string. Given a parsed grammar and a terminal, I just want to know all the other terminals that can follow it.

For example:

start: one | two | three

one: A B
two: A C
three: B A

A: "a"
B: "b"
C: "c"

Using the grammar above, we would get the following:

A -> [B, C]
B -> [A]
C -> []

erezsh commented 6 months ago

Yes, it's called the follow set. It gets set here: https://github.com/lark-parser/lark/blob/master/lark/parsers/grammar_analysis.py#L178

It gets built indirectly by the parsers.

But you can probably initialize the GrammarAnalyzer yourself, by providing it with a ParserConf instance. I believe it's the same instance that Lark creates here: https://github.com/lark-parser/lark/blob/master/lark/lark.py#L487
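To make the idea of a follow set concrete, here is a minimal standalone sketch that computes it for the toy grammar in the question. It does not use Lark's internals: the `GRAMMAR` dict, the `TERMINALS` set and the `follow_sets` helper are names made up for this illustration, and the loop is the usual textbook fixed-point iteration over NULLABLE, FIRST and FOLLOW. Lark's own analyzer may differ in its details, for example by also tracking an end-of-input marker.

```python
# Toy grammar from the question, written as plain Python data.
# Assumption for this sketch: each rule maps to a list of expansions,
# and the names listed in TERMINALS are the terminals.
GRAMMAR = {
    "start": [["one"], ["two"], ["three"]],
    "one": [["A", "B"]],
    "two": [["A", "C"]],
    "three": [["B", "A"]],
}
TERMINALS = {"A", "B", "C"}


def follow_sets(grammar, terminals):
    """Compute FOLLOW for every symbol by iterating until nothing changes."""
    symbols = set(grammar) | terminals
    nullable = set()
    first = {s: ({s} if s in terminals else set()) for s in symbols}
    follow = {s: set() for s in symbols}

    changed = True
    while changed:
        changed = False
        for origin, expansions in grammar.items():
            for expansion in expansions:
                # A rule is nullable if all of its symbols are nullable.
                if origin not in nullable and all(s in nullable for s in expansion):
                    nullable.add(origin)
                    changed = True
                for i, sym in enumerate(expansion):
                    # FIRST(origin) absorbs FIRST(sym) while the prefix is nullable.
                    if all(s in nullable for s in expansion[:i]):
                        if not first[sym] <= first[origin]:
                            first[origin] |= first[sym]
                            changed = True
                    # FOLLOW(sym) absorbs FOLLOW(origin) if the suffix is nullable.
                    if all(s in nullable for s in expansion[i + 1:]):
                        if not follow[origin] <= follow[sym]:
                            follow[sym] |= follow[origin]
                            changed = True
                    # FOLLOW(sym) absorbs FIRST of each later symbol that can
                    # come next once the symbols in between are skipped.
                    for j in range(i + 1, len(expansion)):
                        if all(s in nullable for s in expansion[i + 1:j]):
                            if not first[expansion[j]] <= follow[sym]:
                                follow[sym] |= first[expansion[j]]
                                changed = True
    return follow


follow = follow_sets(GRAMMAR, TERMINALS)
# Keep only terminal -> terminal entries, sorted for stable printing.
terminal_follow = {t: sorted(follow[t] & TERMINALS) for t in sorted(TERMINALS)}
for terminal, followers in terminal_follow.items():
    print(terminal, "->", followers)
```

For this grammar the result matches the table in the question. With a real Lark grammar, the same information should be obtainable from the analyzer linked above rather than from a hand-written grammar dict.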

RevanthRameshkumar commented 6 months ago

That is perfect, ty!