Closed akonsu closed 7 years ago
Nope, youre not gonna run out of stack :-) Nearley uses the Earley algorithm, which is a kind of chart parser.
It can even parse ambiguous inputs, but currently it generates every parse, which can lead to exponential blowup. We're working on that one, though!
Sent with GitHawk
Thanks! Now that I am more familiar with the algorithm, I see that there are no recursive function calls involved. How efficient is your implementation with regards to creating all these sets? There is a lot of duplication in these sets if you just use the Javascript's Set
object, right?
I need to parse Wolfram language, which has a lot of ambiguities and also a lot of operators, so parsing it can result in a lot of items being pushed to each set...
(I am also looking at error recovery, found a couple of academic papers describing modifications of Earley algorithm but this is a separate question)
How efficient is your implementation with regards to creating all these sets? There is a lot of duplication in these sets if you just use the Javascript's Set object, right?
We've done a lot of work to implement the algorithm efficiently—e.g. we're very careful only to process each item (State
) in each set once, even in the presence of nullable rules—so it's a pretty efficient Earley parser. But that's assuming there's no ambiguity; currently we don't de-duplicate identical items (this is something I'm actively working on), so you run into the exponential blowup problem I mentioned above.
I am also looking at error recovery
I've tried to make this work in the past, but it's quite a difficult problem! :)
I'm going to close this, but feel free to shout if you have any more questions. :)
As I understand, this is a top-down parser. How it is implemented? Is it a recursive descent? If I have a grammar with many levels of precedence, would it, at least theoretically, run out of stack while parsing complex expressions? I hope my question makes sense...