igordejanovic / parglare

A pure Python LR/GLR parser - http://www.igordejanovic.net/parglare/
MIT License

Generate a parser as python code #20

Open KOLANICH opened 6 years ago

KOLANICH commented 6 years ago

Parglare is damn slow. Generating a parser is also damn slow. It may be useful, from both a speed and a debugging perspective, to serialize the generated parser into a Python file.

igordejanovic commented 6 years ago

Parser table serialization is something that I plan to do soon.

What were you trying to do when you got slow parsing? Which parser did you use, LR or GLR? parglare is not optimized yet, but it is fast enough for my use cases. Maybe you are using GLR and have a lot of non-determinism in the grammar, so parglare builds a lot of trees, i.e. investigates a lot of possible interpretations. I can't help any further without more input.

KOLANICH commented 6 years ago

I'm using LR for a grammar more complex than a simple expression language. In fact, I converted an LL(?) grammar (more than 200 lines) for CoCo/R into a parglare one, because it used some extensions I have not found in publicly available implementations of CoCo/R, such as "zero or more repeats". One run of the "compile grammar" -> "gen parser" -> "test on file" pipeline takes nearly a minute with CPython on the hardware available to me at work (I remember that the CPU is some Celeron with SSE2 but without x86_64). There are some ambiguities, and I know that I will have to use GLR at some point, but it is slow even in LR mode.

igordejanovic commented 6 years ago

How big is the file you are parsing? For LR you should get at least 300Kb/sec, more or less, depending on the language. GLR has additional overhead and is at the moment approximately 5 times slower.

Can you publicly share your code, grammars, and input files? If not, can you do some measurements, such as the time needed for the parser to initialize and the parsing speed in Kb/sec? I haven't yet optimized parglare extensively, but I'm keeping an eye on performance, so I would like to take a look.
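
The two measurements asked for here (parser initialization time and parsing throughput) could be collected with something like the sketch below. It is library-agnostic: `measure`, `build_parser` and `parse` are hypothetical names introduced for illustration, and the throughput is computed in kilobytes per second on the assumption that "Kb" above means kilobytes.

```python
import time


def measure(build_parser, parse, text):
    """Time parser construction and one parse of `text`.

    `build_parser` is a zero-argument callable returning a parser object;
    `parse` is a callable taking (parser, text). Both are supplied by the
    caller, so nothing here depends on a particular parsing library.
    Returns (init_seconds, parse_seconds, kb_per_sec).
    """
    t0 = time.perf_counter()
    parser = build_parser()
    t1 = time.perf_counter()
    parse(parser, text)
    t2 = time.perf_counter()

    init_seconds = t1 - t0
    parse_seconds = t2 - t1
    # Size of the input in kilobytes (assumption: UTF-8 encoded input).
    size_kb = len(text.encode("utf-8")) / 1024
    kb_per_sec = size_kb / parse_seconds if parse_seconds > 0 else float("inf")
    return init_seconds, parse_seconds, kb_per_sec
```

With parglare, `build_parser` would presumably wrap the grammar loading and parser construction (for example `lambda: Parser(Grammar.from_file("my_grammar.pg"))`, where the file name is made up) and `parse` would be `lambda p, s: p.parse(s)`. Timing the grammar import and the parser construction separately, as in the numbers reported below, only needs one more `perf_counter()` call between the two steps.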

KOLANICH commented 6 years ago

Importing a grammar takes 1.3 sec; generating a parser takes 154 sec.

igordejanovic commented 6 years ago

Related to #52 and #36

igordejanovic commented 6 years ago

The first version of table caching is on the master branch. Please see the updates in HISTORY.md and the docs. Please test with your code and let me know if you have any trouble.
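
For readers unfamiliar with the idea, the general shape of table caching can be sketched as follows. This is not parglare's actual implementation (see HISTORY.md and the docs for that); `load_or_build`, the `.cache` suffix and the use of `pickle` are all assumptions made for illustration. The expensive table is built once, stored next to the grammar file, and reused until the grammar file changes.

```python
import os
import pickle


def load_or_build(source_path, build, cache_suffix=".cache"):
    """Return the table for `source_path`, reusing an on-disk cache if fresh.

    `build` is a caller-supplied callable that takes the source path and
    returns a picklable table. The cache is considered fresh when it is at
    least as new as the source file.
    """
    cache_path = source_path + cache_suffix
    if (os.path.exists(cache_path)
            and os.path.getmtime(cache_path) >= os.path.getmtime(source_path)):
        # Cache is up to date: skip the expensive build step.
        with open(cache_path, "rb") as f:
            return pickle.load(f)

    # Cache missing or stale: rebuild and store for the next run.
    table = build(source_path)
    with open(cache_path, "wb") as f:
        pickle.dump(table, f)
    return table
```

A scheme like this removes the construction cost from every run after the first, which is where most of the time reported earlier in this thread was spent. It does not help with the debugging side of the original request, since a pickled table is not readable Python source.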