yhirose / cpp-peglib

A single file C++ header-only PEG (Parsing Expression Grammars) library
MIT License

Last-resort failure #290

Closed kfsone closed 5 months ago

kfsone commented 5 months ago

I was trying to use error recovery as a last-resort failure, but discovered that recovery matches have the same precedence as regular matches, so the naive grammar:

file <- list_of_a / list_of_b
list_of_a <- a+ '.'
list_of_b <- b+ '.'
a <- 'a' / 'b' %recover(x)
b <- 'b' / 'a' %recover(x)
x <- '' { error_message "don't do that" }

accepts "aaaaa." but rejects "b." as a failed list of a's.

(source problem: trying to ensure the sameness of a list of tokens, like trying to ensure they're all integers vs strings)

What I am trying to achieve is:

as : as a | a
   | (as b { error_message "please don't mix" })
(that is: at least one "a" followed by a "b")
bs : bs b | b | (bs a { error_message "stop mixing already!" })

but expressing that in peg feels clumsy to me:

file <- a_list / b_list
a_list <- 'a' ('a' / 'b' %recover(x))* '.'
b_list <- 'b' ('b' / 'a' %recover(x))* '.'
x <- '' { error_message "please don't mix" }

The "last-resort" in the title is referring to the question of whether there is a way to mark a recovery as low precedence so that a / b^label doesn't try to match b aggressively here but as a last-resort?
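To make the "last-resort" idea concrete, here is a small hand-written C++ sketch of the behaviour being asked for. It is not cpp-peglib code and does not use any cpp-peglib API: `ParseResult`, `parse_list`, and `parse_file` are hypothetical names invented for illustration. It assumes a two-pass strategy: every alternative is first tried strictly, and only if all of them fail are they retried with the recovery branch enabled.

```cpp
#include <cstddef>
#include <initializer_list>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical result type: `ok` is true for a clean parse,
// otherwise `error` holds the diagnostic.
struct ParseResult {
    bool ok;
    std::string error;
};

// Roughly: list <- elem (elem / other-as-error)* '.'
// With allow_recovery == false, `other` is a plain mismatch.
// With allow_recovery == true, `other` is consumed and reported as an error.
// Returns std::nullopt when the rule does not match at all.
static std::optional<ParseResult> parse_list(std::string_view in, char elem,
                                             char other, bool allow_recovery) {
    if (in.empty() || in[0] != elem) return std::nullopt;  // first item decides the list kind
    bool mixed = false;
    std::size_t i = 1;
    for (; i < in.size(); ++i) {
        if (in[i] == elem) continue;
        if (allow_recovery && in[i] == other) { mixed = true; continue; }
        break;
    }
    // Exactly one character must remain, and it must be the terminating '.'.
    if (i + 1 != in.size() || in[i] != '.') return std::nullopt;
    if (mixed) return ParseResult{false, "please don't mix"};
    return ParseResult{true, ""};
}

// file <- a_list / b_list, with recovery only as a last resort:
// pass 1 tries both alternatives strictly, pass 2 retries them with recovery.
static ParseResult parse_file(std::string_view in) {
    for (bool recover : {false, true}) {
        if (auto r = parse_list(in, 'a', 'b', recover)) return *r;
        if (auto r = parse_list(in, 'b', 'a', recover)) return *r;
    }
    return ParseResult{false, "syntax error"};
}
```

With this sketch, `parse_file("aaaaa.")` and `parse_file("b.")` both succeed in the strict pass, while `parse_file("aab.")` only matches in the second pass and yields "please don't mix". The cost of this design is that failing input is scanned twice.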

mingodad commented 5 months ago

I've just added your bison/flex grammar to https://mingodad.github.io/parsertl-playground/playground/, a Yacc/Lex-compatible online editor/tester that uses a UI based on the cpp-peglib playground (select "Parseland parser" from Examples, then click Parse to see a parse tree for the content in Input source).

I implemented some missing pieces there to be able to parse the example shown in your README.

kfsone commented 5 months ago

@mingodad Amazing! Ahh, I was hacking together a flex++/bison version of it in that folder, but lex/yacc may be a better starting place - I was already hating the amount of work to reach ground zero using flex++/bison even though I'm leaning on three grammars I've done with them including https://github.com/kfsone/flub

Thanks for the awesome work and the links, will give you credit in the README.md.

yhirose commented 5 months ago

@kfsone I haven't looked at it carefully yet. Is it a bug in cpp-peglib or just a question about your grammar?

kfsone commented 5 months ago

@yhirose wow, I just reparsed my own issue text ... I think I was doing too many things at once, humble apologies. I'll edit the issue with a better, trivial summary.

yhirose commented 5 months ago

> The "last-resort" in the title is referring to the question of whether there is a way to mark a recovery as low precedence so that a / b^label doesn't try to match b aggressively here but as a last-resort?

The only way that I can think of is actually the same as your clumsy solution. :)

yhirose commented 5 months ago

If we use a macro, it can be a bit simpler.

file       <- a_list / b_list
a_list     <- list('a')
b_list     <- list('b')
list(elem) <- elem (elem^x)* '.'
x          <- .* '.' { error_message "please don't mix" }

kfsone commented 5 months ago

Using the macro, though, makes it quite elegant.