igordejanovic / parglare

A pure Python LR/GLR parser - http://www.igordejanovic.net/parglare/
MIT License

Asm parser CPU/memory hungry #120

Closed: xor2003 closed 4 years ago

xor2003 commented 4 years ago

Description

Compilation consumes a lot of CPU and memory, and I don't know how to troubleshoot it.

What I Did

EBNF: _masm61.zip

pglr --no-colors --debug compile _masm61.pg

igordejanovic commented 4 years ago

Thanks for the report. Your grammar is huge and has a huge number of conflicts. It will be a good input for optimization. I will investigate in the following days.

xor2003 commented 4 years ago

Would it be possible to add some additional diagnostics?

igordejanovic commented 4 years ago

Definitely. Some tracing info during table construction will be needed.

igordejanovic commented 4 years ago

I've added debug printouts during table construction and removed some unnecessary string constructions. It now compiles.

BTW, your grammar can be optimized in this part:

decNumber
  :decdigit+
  ; 
digits
  :decdigit+| digits hexdigit
  ; 
...
radixOverride
  : 'H' | 'O' | 'Q' | 'T' | 'Y'
  ; 
decdigit
  : '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
  ; 
hexdigit
  : 'A' | 'B' | 'C' | 'D' | 'E' | 'F'
  ; 
letter : "A" | "B" | "C" | "D" | "E" | "F" | "G"
       | "H" | "I" | "J" | "K" | "L" | "M" | "N"
       | "O" | "P" | "Q" | "R" | "S" | "T" | "U"
       | "V" | "W" | "X" | "Y" | "Z" | "a" | "b"
       | "c" | "d" | "e" | "f" | "g" | "h" | "i"
       | "j" | "k" | "l" | "m" | "n" | "o" | "p"
       | "q" | "r" | "s" | "t" | "u" | "v" | "w"
       | "x" | "y" | "z" | "_";
id: letter
| id letter
| id decdigit;

You could use regex matches for these rules (id, hexdigit, decdigit, radixOverride) instead of having the parser recognize the lexemes character by character. Turning them into regexes reduces the number of states, so both table construction and the parsing itself will be faster.
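
As a rough illustration of the suggestion above, here is a minimal Python sketch of regexes that accept the same strings as the quoted rules. The patterns are my reading of the rule bodies, not something taken from the fixed grammar, and `LEXEMES`, `matches` and `as_terminal_rules` are hypothetical helper names. The rendered lines assume parglare's `name: /regex/;` form for regex terminals; check the parglare grammar documentation before pasting them into a `.pg` file.

```python
import re

# Regex equivalents of the character-level rules quoted above.
# ASSUMPTION: patterns are derived by hand from the rule bodies in this thread.
LEXEMES = {
    "id": r"[A-Za-z_][A-Za-z_0-9]*",   # letter, then letters or decimal digits
    "decdigit": r"[0-9]",
    "hexdigit": r"[A-F]",              # the quoted rule lists only A..F
    "radixOverride": r"[HOQTY]",
}


def matches(rule: str, text: str) -> bool:
    """Return True if the whole text is a single lexeme of the given rule."""
    return re.fullmatch(LEXEMES[rule], text) is not None


def as_terminal_rules() -> str:
    """Render the patterns as `name: /regex/;` lines (assumed terminal syntax)."""
    return "\n".join(f"{name}: /{pattern}/;" for name, pattern in LEXEMES.items())


if __name__ == "__main__":
    print(as_terminal_rules())
    print(matches("id", "_start1"), matches("id", "1abc"))
```

With this approach each identifier reaches the parser as one token, so the recursive `id` and `letter` productions, and the states generated for them, are no longer needed.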

xor2003 commented 4 years ago

Thank you! It is much better now.