Closed jkm closed 12 years ago
On Mon, Aug 13, 2012 at 11:35 AM, jkm notifications@github.com wrote:
Assume you want to parse the expression min(0, max(2, 3)). Then each alternative is checked first. E.g. before "min" is parsed 3 * 3 * 3. I.e it is especially costly if the last alternative will always succeed. I assume this why pegs can take exponential time to parse.
Yes.
I think a packrat parser solves those issues by some memory trade-off. For some (probably most) grammars a LR parser may work even better in practice. Are there any plans to support more efficient parsing schemes? Are they possible to integrate with the current design?
Yes also. Pegged is just a parser generator and I knew just a little bit about PEGs until now. I'm trying to become more literate in general parsing technology (LL(1), LALR, GLR, LL(*), and so on).
My current goal is to have PEG generate a LL(1) or LALR(1) parser. GLR
could come afterwards, and a fully-backtracking recursive-descent parser,
for ambiguous grammars. I'll see. The basic idea is the same: enter a
grammar in a natural way and call
grammar!(someCompileTimeArgument)(grammarText)
, where
someCompileTimeArgument
will most probably serve to select the parser
generator.
It'll take a 2 months, at least, to get LALR and LL(1) working. I have some very basic code, but nothing functional right now.
I see. Nice to hear that you are already aware of it. Even though it makes may current development cycle slow for certain input I can stay away from these inputs for the time being and work on other things first.
On Mon, Aug 13, 2012 at 4:02 PM, jkm notifications@github.com wrote:
I see. Nice to hear that you are already aware of it. Even though it makes may current development cycle slow for certain input I can stay away from these inputs and work on other things.
I can try to implement packrat parsing. That's what should require the lowest amount of modifications.
It's not that urgent and having a LALR or LL(1) parser is more important in the long run IMHO. Currently it's sufficient for me to know where the problem is. I just was surprised by the run-time behavior. From my point you can safely follow your schedule and goals.
I have a grammar where some input is parsed very slowly. Consider a grammar like:
Assume you want to parse the expression
min(0, max(2, 3))
. Then each alternative is checked first. E.g. before "min" is parsed 3 * 3 * 3. I.e it is especially costly if the last alternative will always succeed. I assume this why pegs can take exponential time to parse. I think a packrat parser solves those issues by some memory trade-off. For some (probably most) grammars a LR parser may work even better in practice. Are there any plans to support more efficient parsing schemes? Are they possible to integrate with the current design?