Open akonsu opened 7 years ago
Perhaps this is related to your work on eliminating the exponential memory consumption when working with ambiguous grammars?
No, sorry; the parsing algorithm in master has been stable for a while now (since I did a big push of making things faster).
I’m afraid I don’t have time to look through your grammar, but I suspect it is, in fact, ambiguous :-)
The other option is that it’s right-recursive, but I don’t think that’s the case here.
I hate to say it, but have you tried switching to a tokenizer? It tends to help with performance issues!
Yes, it is ambiguous, and maybe recursive. That was a part of the test. I wanted to see how much it can handle. And, yes, I am going to switch to a tokenizer. Do you mean that there is not much that can be done in the code to alleviate the memory issues? Other than being careful when writing grammars?
Ah, your grammar is definitely right-recursive, due to the `expr -> identifier _ "+" _ expr` rule. Earley doesn't like right-recursive rules very much; one of the things on my to-do list is to re-implement Leo's optimisation, which should help with this. For now, you should prefer left-recursive rules where possible (e.g. `expr -> expr _ "+" _ identifier`), since they perform better. I think that's the problem you're having here.
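To make the difference concrete, here is a rough sketch of a toy Earley recognizer that counts chart items for both rule shapes. It is not nearley's implementation: `earleyItemCount`, `makeTokens`, and the two rule tables are hypothetical names made up for this illustration, the whitespace symbol `_` is left out, and the input is assumed to be pre-tokenized into `id` and `+`.

```javascript
// Toy Earley recognizer (no nullable rules) that counts chart items.
// Illustration only; this is not how nearley is implemented internally.
function earleyItemCount(rules, start, tokens) {
  const nonterminals = new Set(rules.map((r) => r.lhs));
  const chart = Array.from({ length: tokens.length + 1 }, () => []);
  const seen = chart.map(() => new Set());

  // Add an item (rule index, dot position, origin column) unless already present.
  function add(col, rule, dot, origin) {
    const key = rule + ":" + dot + ":" + origin;
    if (seen[col].has(key)) return;
    seen[col].add(key);
    chart[col].push({ rule, dot, origin });
  }

  rules.forEach((r, i) => { if (r.lhs === start) add(0, i, 0, 0); });

  for (let col = 0; col <= tokens.length; col++) {
    // chart[col] grows while we walk it, so use an index loop.
    for (let i = 0; i < chart[col].length; i++) {
      const { rule, dot, origin } = chart[col][i];
      const { lhs, rhs } = rules[rule];
      const next = rhs[dot];
      if (dot === rhs.length) {
        // Complete: advance every item in the origin column that was waiting on lhs.
        for (const w of chart[origin]) {
          if (rules[w.rule].rhs[w.dot] === lhs) add(col, w.rule, w.dot + 1, w.origin);
        }
      } else if (nonterminals.has(next)) {
        // Predict: start every rule for the expected nonterminal in this column.
        rules.forEach((r, j) => { if (r.lhs === next) add(col, j, 0, col); });
      } else if (col < tokens.length && tokens[col] === next) {
        // Scan: the next token matches the expected terminal.
        add(col + 1, rule, dot + 1, origin);
      }
    }
  }

  const accepted = chart[tokens.length].some((it) =>
    it.origin === 0 &&
    rules[it.rule].lhs === start &&
    it.dot === rules[it.rule].rhs.length);
  const items = chart.reduce((sum, column) => sum + column.length, 0);
  return { accepted, items };
}

// Hypothetical stand-ins for the two rule shapes discussed above.
const rightRecursive = [
  { lhs: "expr", rhs: ["id", "+", "expr"] },
  { lhs: "expr", rhs: ["id"] },
];
const leftRecursive = [
  { lhs: "expr", rhs: ["expr", "+", "id"] },
  { lhs: "expr", rhs: ["id"] },
];

// Build the token stream id + id + ... + id with n identifiers.
function makeTokens(n) {
  const tokens = [];
  for (let i = 0; i < n; i++) {
    if (i > 0) tokens.push("+");
    tokens.push("id");
  }
  return tokens;
}

for (const n of [10, 100, 400]) {
  const tokens = makeTokens(n);
  const right = earleyItemCount(rightRecursive, "expr", tokens).items;
  const left = earleyItemCount(leftRecursive, "expr", tokens).items;
  console.log(`n=${n}: right-recursive items=${right}, left-recursive items=${left}`);
}
```

In this toy, the right-recursive version re-completes every enclosing `expr` after each identifier, so the chart grows roughly quadratically with the input, while the left-recursive version adds a constant number of items per token. Note that switching to left recursion also changes the associativity of the resulting tree, which usually doesn't matter for `+`.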
I don't think it's relevant here, but a note on ambiguity anyway: While Nearley does provide all parsings for ambiguous grammars/inputs, we don't (yet) use an efficient representation (the preferred one is a "packed parse forest"). Since we end up producing an array containing each parse, performance degrades the more ambiguity you have. Generally this isn't a problem, since you want to aim to write an unambiguous grammar!
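To get a feel for how quickly that array of parses can grow, here is a small sketch that counts parse trees for a hypothetical, fully ambiguous rule `expr -> expr "+" expr | identifier`. This rule and the `countParses` helper are assumptions for illustration; they are not taken from the grammar in this issue, and the code does not call nearley at all.

```javascript
// Count the distinct parse trees of `expr -> expr "+" expr | identifier`
// over an input with n identifiers joined by "+".
// Every tree splits at one "+", so the count over k identifiers is the sum,
// over all split points, of (trees on the left) * (trees on the right).
function countParses(n) {
  const counts = [0, 1]; // counts[k] = number of trees over k identifiers
  for (let k = 2; k <= n; k++) {
    let total = 0;
    for (let left = 1; left < k; left++) {
      total += counts[left] * counts[k - left];
    }
    counts[k] = total;
  }
  return counts[n];
}

for (const n of [2, 4, 8, 16]) {
  console.log(`${n} identifiers: ${countParses(n)} parses`);
}
```

These counts are the Catalan numbers, which grow exponentially, so materialising every parse in an array becomes impractical for a rule like this after a few dozen operands.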
I have a script, `mkexpr.js`, that generates a very big expression. My grammar is in `wl-grammar.ne`. Then I run:
node mkexpr.js && nearleyc.js wl-grammar.ne -o wl-grammar.js && more test.wl | nearley-test.js wl-grammar.js
This produces an error: