dominiccooney / guruguru

A PEG parser library in Emacs Lisp.
http://wiki.github.com/coonsta/guruguru
4 stars 2 forks source link

Nullable left-recursive rules fail #1

Open dominiccooney opened 14 years ago

dominiccooney commented 14 years ago

This grammar fails to parse "aaaa":

(gg-grammar
 (start n := x eof -> n)
 (x n := x a -> (+ 1 n)
   | -> 0)
 (a ?a))

The intent is similar to this grammar:

(gg-grammar
 (start n := y eof -> n)
 (y n := x -> n
  | -> 0)
 (x n := x a -> (+ 1 n)
  | a -> 1)
 (a ?a))

It seems nullable left-recursive rules fail, perhaps because they're not "making progress."

dominiccooney commented 14 years ago

Interestingly this equivalent grammar succeeds:

(gg-grammar
  (start n := xs eof -> n)
  (xs
     n := xs ?x -> (1+ n)
   | -> 0))
dominiccooney commented 14 years ago

I think this is a bug in the left-recursive PEG algorithm; pinged gurus at https://lists.csail.mit.edu/pipermail/peg/2009-November/000244.html

dominiccooney commented 14 years ago

I think the best thing to do is to disallow nullable left-recursive rules, probably by warning about them and failing if they're activated.

dominiccooney commented 14 years ago

After adding some tracing, the problem is the heuristic in gg-recall that doesn't activate rules which aren't involved in the recursion. In the first grammar above, rule a is not involved in the recursion (only a is) and so gg-recall eagerly memoizes failure for it.

If that heuristic is relaxed, the parser succeeds and computes the right result. However this has bad downstream effects when a is right-recursive. Laurence Tratt has drafted a paper about this at https://lists.csail.mit.edu/pipermail/peg/2009-November/000248.html