yhirose / cpp-peglib

A single file C++ header-only PEG (Parsing Expression Grammars) library
MIT License

Bug in negation predicate #210

Closed mingodad closed 2 years ago

mingodad commented 2 years ago

While trying this grammar:

STRING_LONG <-
     '[' '='* '['  ( ! (']' '='* ']') . )* ']' '='* ']'

With this input:

[==[One string]]==]

I get the following error, although I expected none and the AST is displayed correctly:

1:16 syntax error, unexpected ']', expecting '='.

See discussion here: https://github.com/ChrisHixon/chpeg/issues/17#issuecomment-1146876233

ChrisHixon commented 2 years ago

I believe this is a misunderstanding of how PEG works. Because '='* can match zero characters, the negation predicate here works like !(']' ']') or simply !']]'. The inner loop matches up to the g in "string"; then one ] matches, then '='* matches zero characters, then one more ] matches. That leaves ==] in the input. cpp-peglib doesn't seem to indicate that there are excess characters left in the input.

The error in chpeg from this:

Extraneous input: parse consumed 16 bytes out of 19
  Expected: "=" in STRING_LONG at offset 15
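
To make the trace above concrete, here is a minimal hand-written sketch of the same rule in plain C++. It does not use cpp-peglib or chpeg; the function names closer_ahead and match_string_long are made up for illustration. It only mimics the standard PEG semantics (greedy repetition, non-consuming not-predicate) and reports how many bytes the rule consumes.

```cpp
#include <cstddef>
#include <string>

// Lookahead for  ']' '='* ']'  at `pos`. It never consumes input;
// it only reports whether that sequence matches here.
bool closer_ahead(const std::string& in, std::size_t pos) {
    if (pos >= in.size() || in[pos] != ']') return false;
    ++pos;
    while (pos < in.size() && in[pos] == '=') ++pos;  // greedy, may match zero '='
    return pos < in.size() && in[pos] == ']';
}

// Hand-written equivalent (hypothetical helper, not a library API) of
//   STRING_LONG <- '[' '='* '[' ( !(']' '='* ']') . )* ']' '='* ']'
// Returns the number of bytes consumed, or std::string::npos on failure.
std::size_t match_string_long(const std::string& in) {
    std::size_t pos = 0;
    if (pos >= in.size() || in[pos] != '[') return std::string::npos;
    ++pos;
    while (pos < in.size() && in[pos] == '=') ++pos;   // '='*
    if (pos >= in.size() || in[pos] != '[') return std::string::npos;
    ++pos;
    // ( !closer . )*  stops at the FIRST place a closer of any level matches.
    while (pos < in.size() && !closer_ahead(in, pos)) ++pos;
    if (pos >= in.size() || in[pos] != ']') return std::string::npos;
    ++pos;
    while (pos < in.size() && in[pos] == '=') ++pos;   // '='*
    if (pos >= in.size() || in[pos] != ']') return std::string::npos;
    return pos + 1;
}
```

Calling match_string_long("[==[One string]]==]") returns 16 for the 19-byte input, which agrees with the "consumed 16 bytes out of 19" message: the rule itself succeeds on the ]] and the trailing ==] is simply left over.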
yhirose commented 2 years ago

cpp-peglib works exactly as @ChrisHixon mentioned.

Here is the result on the cpp-peglib playground (I added EOF !. at the end to get a better error position):

[image: playground screenshot of the result]

I think this is the natural PEG behavior, since I implemented it faithfully according to Bryan Ford's paper and didn't do anything special. To accomplish what you probably want to do, I added another 'not predicate' operator.

[image]

I am not sure if it covers all the situations that you are thinking of. But at least it fixes the error in your example. Hope it helps.
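
For comparison, the behavior the original grammar was presumably aiming for (the closing bracket must carry the same number of '=' as the opening one) can be sketched by hand in plain C++. This is not the new operator mentioned above and not cpp-peglib code; match_long_string_balanced is a hypothetical helper that only illustrates the intended matching rule.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: matches '[' '='{n} '[' ... ']' '='{n} ']' where the
// closing bracket must repeat exactly the n '=' seen in the opener.
// Returns the number of bytes consumed, or std::string::npos on failure.
std::size_t match_long_string_balanced(const std::string& in) {
    std::size_t pos = 0;
    if (pos >= in.size() || in[pos] != '[') return std::string::npos;
    ++pos;
    std::size_t level = 0;                       // number of '=' in the opener
    while (pos < in.size() && in[pos] == '=') { ++pos; ++level; }
    if (pos >= in.size() || in[pos] != '[') return std::string::npos;
    ++pos;
    // Scan the body until a closer with the SAME level appears.
    for (; pos + level + 1 < in.size(); ++pos) {
        if (in[pos] != ']') continue;
        std::size_t k = 0;
        while (k < level && in[pos + 1 + k] == '=') ++k;
        if (k == level && in[pos + 1 + level] == ']')
            return pos + level + 2;              // consumed through the closer
    }
    return std::string::npos;                    // no matching closer found
}
```

With this rule, "[==[One string]]==]" is consumed in full (19 bytes), because the ]] inside the body has level 0 and is not treated as a closer for a level-2 opener.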

mingodad commented 2 years ago

Thank you! It seems I got a bit confused here, based on the results of trying to extend chpeg.