yhirose / cpp-peglib

A single file C++ header-only PEG (Parsing Expression Grammars) library
MIT License

%t doesn't (always) show the unexpected token #238

Closed kfsone closed 2 years ago

kfsone commented 2 years ago

Given:

program <- enum+

# treat any non-punctuated sequence of alpha/digits/underscores as a word,
%word <- [a-zA-Z0-9_]+

~space <- [ \r\t\n]

enum <- ~enum_keyword space+ enum_kind

enum_keyword <- "enum"

## VARIANT 1:
#enum_kind <- ( sequence / bitmask / %recover(untyped_enum) )
#sequence <- "sequence"
#bitmask <- "bitmask"

## VARIANT 2:
#enum_kind <- ( "sequence" / "bitmask" / %recover(untyped_enum) )

untyped_enum <- '' { message "invalid/missing enum type, expected one of 'sequence' or 'bitmask', got '%t'"}

With the input 'enum sequencer', the error message does not report "got 'sequencer'"; instead it reports "got 'sequence'".

image

image

Edit: maybe this is a size constraint issue? This is what I get if I give it a string shorter than the first/longest candidate:

image

Or this oddity (the input is 'bitmaskerror', but %t prints 'bitmaske'):

image

Edit: yes, this appears to be the problem:

image
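
Both outputs quoted above ('sequence' for 'sequencer', 'bitmaske' for 'bitmaskerror') are exactly 8 characters long, which is the length of 'sequence', the longest candidate literal. The following is a minimal C++ sketch of that hypothesis only; it is not taken from cpp-peglib's source, and `word_at` and `clipped_at` are hypothetical helper names introduced for illustration. It contrasts the token I expected (the whole %word at the error position) with a token clipped to the longest candidate's length.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: the whole word starting at `pos`, using the same
// character class as the grammar's %word rule ([a-zA-Z0-9_]+).
// This is what I expected %t to print.
std::string word_at(const std::string& input, std::size_t pos) {
    if (pos > input.size()) return "";
    std::size_t end = pos;
    while (end < input.size() &&
           (std::isalnum(static_cast<unsigned char>(input[end])) ||
            input[end] == '_')) {
        ++end;
    }
    return input.substr(pos, end - pos);
}

// Hypothetical helper modelling the observed output: the text at `pos`,
// cut off at the length of the longest candidate literal.
// Assumption: this mirrors the symptom, not the library's implementation.
std::string clipped_at(const std::string& input, std::size_t pos,
                       const std::vector<std::string>& candidates) {
    if (pos > input.size()) return "";
    std::size_t longest = 0;
    for (const auto& c : candidates) {
        longest = std::max(longest, c.size());
    }
    return input.substr(pos, longest);
}
```

With the candidates 'sequence' and 'bitmask' and an error position of 5 (just after 'enum '), `clipped_at` reproduces both reported strings, while `word_at` returns the full 'sequencer' and 'bitmaskerror'.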

yhirose commented 2 years ago

Thanks for the report!