scymtym / esrap

Common Lisp packrat parser
https://scymtym.github.io/esrap/
78 stars 12 forks source link

Resume from position of failure? #12

Closed mthom closed 4 years ago

mthom commented 4 years ago

I have a problem with rules that are too greedy, and I'm not sure esrap has a good built-in way of handling it. Suppose I have this grammar:

(defrule parse-expr (and chars "()"))

(defrule chars (* (or (alpha-char-p character) "(")))

chars greedily consumes the "(", and next throws a parse error. Doesn't esrap perform backtracking? If not, shouldn't it be possible to catch an error from one rule, and restart the parse from condition object's position, perhaps using a different rule?

scymtym commented 4 years ago

Doesn't esrap perform backtracking?

Esrap is a recursive descent parser with memoization but without backtracking. Repetition expressions are greedy. For "alternative" (or) expressions, the first succeeding alternative is chosen definitively.

If not, shouldn't it be possible to catch an error from one rule, and restart the parse from condition object's position, perhaps using a different rule?

To guide Esrap towards a different parse locally to make the parse succeed globally, the esrap:! (input must not match) and esrap:& (input must match) constructs can (or must, really) be used:

(esrap:defrule chars
    (* (or (alpha-char-p character)
       (and (esrap:! "()") "(")))
  (:text t))

(esrap:parse 'parse-expr "f(oo(bar()") ; => ("f(oo(bar" "()")

The expression (and (esrap:! "()") "(")) looks at the input at the current position and fails if the it matches (). If the input at the current position does not match (), (esrap:! "()") succeeds without advancing the current position. So the input () does not match but ( does.