humanitiesclinic opened this issue 4 years ago
Sorry if the grammar and the file to parse are very big. I have edited the post to mark them as code, so it is neater.
Are there any suggestions?
Hi. First, I must say that I never expected this lib to parse files this big. When I created it, I tried to build a tool that would be easier to use than a LALR parser and more expressive than regex, and I was prepared to sacrifice performance for that.
However, I parsed a 1 MB JSON file in ~1 s, so I guess it is possible to parse your C file in a similar time. To achieve this you must tweak your grammar a bit. I noticed two things when I analyzed your grammar:
1. You use a lot of the following pattern: `x_list :=> x` / `x_list :=> x_list x` (or `x_list :=> list_x x`). I recommend rewriting it in a greedy way, like this: `x_list :=> x` / `x_list :=> x x_list` (note that `x_list` is now on the right side of the recursive rule), or simply `x_list :=> x++`.
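The reason the greedy, right-recursive form is fast in a top-down parser may be easier to see in code. Below is a minimal sketch (not this library's actual internals) of how `x_list :=> x x_list` collapses into a simple loop, whereas the left-recursive `x_list :=> x_list x` would make a naive recursive-descent parser recurse on itself before consuming any input:

```python
# Hedged sketch: greedy list parsing as a loop. parse_x is any function
# that returns (item, new_pos) on success or None on failure.
def parse_x_list(tokens, pos, parse_x):
    """Greedily consume as many x as possible; returns (items, new_pos)."""
    items = []
    while True:
        result = parse_x(tokens, pos)
        if result is None:
            break  # no more x: the list ends here, no backtracking needed
        item, pos = result
        items.append(item)
    return items, pos

# Toy x rule for illustration: a single digit token.
def parse_digit(tokens, pos):
    if pos < len(tokens) and tokens[pos].isdigit():
        return tokens[pos], pos + 1
    return None

items, end = parse_x_list(list("123ab"), 0, parse_digit)
# items == ['1', '2', '3'], end == 3
```

With the left-recursive form there is no such loop: `x_list` calls itself at the same position, so the parser either overflows or has to do expensive bookkeeping to recover.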
2. You should really use more regexes. For example, instead of `decimal_constant :=> nonzero_digit` / `decimal_constant :=> decimal_constant digit`, use `decimal_constant :=> /[1-9][0-9]*/` (note the `*`, so multi-digit constants match). Similarly: `octal_constant :=> /0[0-7]*/`, `hexadecimal_constant :=> /0x[0-9a-fA-F]+/`, and, most common and therefore most important: `identifier :=> /([_a-zA-Z]|\u[0-9a-fA-F]{4})([_a-zA-Z0-9]|\u[0-9a-fA-F]{4})*/`.
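If it helps to sanity-check those token regexes before putting them in the grammar, here is a small Python sketch of them (assumption: C-style constants and identifiers; note that in a Python pattern the literal backslash of `\uXXXX` must be written as `\\u`):

```python
import re

# The suggested token regexes, anchored so we test whole-token matches.
decimal_constant = re.compile(r'[1-9][0-9]*\Z')   # the '*' allows multi-digit
octal_constant   = re.compile(r'0[0-7]*\Z')
hex_constant     = re.compile(r'0x[0-9a-fA-F]+\Z')
identifier       = re.compile(r'([_a-zA-Z]|\\u[0-9a-fA-F]{4})'
                              r'([_a-zA-Z0-9]|\\u[0-9a-fA-F]{4})*\Z')

print(bool(decimal_constant.match("12345")))       # multi-digit decimal
print(bool(octal_constant.match("0755")))          # leading 0 -> octal
print(bool(hex_constant.match("0xDEADbeef")))      # mixed-case hex digits
print(bool(identifier.match(r"\u0041bc_2")))       # universal character name
```

Folding whole token classes into single regexes like this removes many grammar rules per character of input, which is usually where most of the parse time goes.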
ok noted, thank you very much
I finally have a grammar with no errors. However, parsing a file takes very, very long. Is there a way to know how long it is expected to take, and is there any way to make it faster? I tried enabling PEG, but it seems to take forever as well.
The string to parse:
The grammar: