lark-parser / lark

Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
MIT License
4.76k stars 402 forks source link

Earley recogniser - Leo transitives #397

Open deniskyashif opened 5 years ago

deniskyashif commented 5 years ago

Hi, I noticed that there's an implementation of Joop Leo's optimisation for right recursion, but it is commented out.

I'm fairly new to the concept of this algorithm so I'm trying to learn. Is there a reason to omit this, or maybe there's some other way of achieving linear time for LR(k) grammars?

erezsh commented 5 years ago

It was commented out due to bugs in some edge cases.

erezsh commented 5 years ago

Maybe @night199uk has a better answer.

night199uk commented 5 years ago

The Leo implementation had a minor bug that that sometimes causes duplicate start symbols. I believe this is due to the fact that the original Joop Leo paper/implementation calls for an injected start symbol S' => S to root the generation of transitives. Unfortunately, I'm not in a position I can provide a fix right now; but hopefully that provides you some breadcrumbs.