yhirose / cpp-peglib

A single file C++ header-only PEG (Parsing Expression Grammars) library
MIT License
916 stars 113 forks source link

Odd unexpected #202

Closed kfsone closed 2 years ago

kfsone commented 2 years ago

Full grammar & example at the end. I'm seeing an unexpected claim that the parser is expecting an identifier where I think it should only be expecting a word from a named dictionary:

ROOT <- Block* EOI
Heading             <- "[Block" Name Repetition "]"
Name                <- < Identifier >
Repetition          <- "Once" | "Shuffle" | "InOrder"

Sees

[Block block-name Typo]

and complains:

2:15 syntax error, unexpected 'sometypo', expecting <Repetition>, <Name>.

Full grammar & example

ROOT <- Block* EOI

%whitespace         <- [ \t]* ( '#' [^\n]* '\r'? )?
EOL                 <- '\r'? '\n'
~EOI                <- EOL* !.
~NL                 <- '\r'? '\n'
~_                  <- [ \t]

Identifier          <- [A-Za-z][A-Za-z0-9-]*

%word               <- < Identifier >

# To ensure that there is nothing significant after the [...]
Block               <- EOL* Heading NL Lines*

Heading             <- "[Block" Name Repetition "]"
Name                <- < Identifier >
Repetition          <- "Once" | "Shuffle" | "InOrder"

Lines               <- "..."

example

# ignore comments
[Block myname sometypo]
#23456789|12345
...

2:15 syntax error, unexpected 'sometypo', expecting <Repetition>, <Name>.
yhirose commented 2 years ago

@kfsone, thank you for the feedback. One of the weaknesses of using PEG parser is poor error reporting. cpp-peglib supports the furthest failure error position report as described in the Bryan Ford original document. But unfortunately, it doesn't give us good error reports as you experienced in your grammar...

Please see the section Error report and recovery in README to implement better error messages and recovery with recovery operator. You can also see some examples in test/test2.cc. Hope it helps.

kfsone commented 2 years ago

test2.cc is a treasure trove. +9000 for that.

My problem is that it says it's expecting Repetition or Name after repetition, when that's clearly not what the grammar says:

Rule:

Heading <- "[Block" Name Repetition "]"

Source:

# line1
[Block myname sometypo]   # line2
2:15 syntax error, unexpected 'sometypo', expecting <Repetition>, <Name>.

which is the beginning of the 3rd token, which should be a Repetition.

[Block myname sometypo]   # line2
123456789|1234^  # col 15 = start of 'sometypo'

I reduced it even further, with manual whitespace control it works

image

but turn on whitespace skipping:

image

(same for " ", [ ], [ \t], with either of ? or *)

image

yhirose commented 2 years ago

@kfsone, thanks for more detailed information! I'll take a look at the default error reporting feature to see if I can make any improvements to make error messages more understandable.

yhirose commented 2 years ago

@kfsone, I improve the default error reporting. I actually found some bugs in the code. I really appreciate your contribution. Please let me know if you find any problem. Thanks!

image
yhirose commented 2 years ago

@kfsone, I discovered that this fix actually causes other serious problems. I'll revert back to the original default error recovery for now... Sorry for that...

yhirose commented 2 years ago

I made additional changes to fix regressions.

kfsone commented 2 years ago

@yhirose I would usually have done you the courtesy of a quick look at the changes and tested it locally, but I've been under crunch so I did not get chance to.

Thank you for your diligence!