yhirose / cpp-peglib

A single file C++ header-only PEG (Parsing Expression Grammars) library
MIT License

Use named capture in error message? #289

Open kfsone opened 8 months ago

kfsone commented 8 months ago

Consider the following grammar, which declares an enum and enforces a minimum number of members. This works:

Enums <- Enum*
%whitespace <- [ \t\r\n]*
%word <- [A-Za-z0-9_.]+

NAME <- < [A-Za-z_][A-Za-z0-9_]* >

Enum <- 'enum' ↑ $name<NAME> '{' ↑ NAME+^enum_count '}'

enum_count <- '' { error_message "enum must contain at least one member" }

but it is inelegant.

It would be really nice if we could use the named capture in error_message. Some thoughts:

TypeDef <- $^type<'type'> ↑ $^name<NAME> Parent? '{' Members '}'
Members <- Member (',' Member)* ','?
Member <- $^'member' $type<NAME> $^field<NAME> MemberDefault
MemberDefault <- '{' '}'  # imagine there's some complex recursion here

Now this has the capability to automatically take:

type Connection { Status state {;} }

and report

1:33: type->Connection->member->state: unexpected ';', expecting '}'

Or perhaps, rather than the ugly $^'member', you could allow '^' in front of the production name to indicate that it should push its name onto the context stack:

^Type <- 'type' ↑ $^name<NAME> Parent? '{' Members '}'
Members <- Member (',' Member)* ','?
^Member <- $type<NAME> $^field<NAME> DefaultValue
^DefaultValue <- '{' '}'

which would report

1:33: Type->Connection->Member->state->DefaultValue: unexpected ';', expecting '}'
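
The context-stack idea can be sketched outside the parser. The following is a minimal Rust sketch, not cpp-peglib code: `ContextStack`, `push`, `pop`, `report`, and `demo` are hypothetical names. It assumes the parser would call `push` when it enters a '^'-marked production or matches a `$^` capture, and `pop` when the owning production finishes.

```rust
/// Hypothetical context stack holding the names pushed by '^'-marked
/// productions and `$^` captures while the parser descends.
#[derive(Default)]
struct ContextStack {
    frames: Vec<String>,
}

impl ContextStack {
    /// Assumed to be called on entering a marked production or matching a marked capture.
    fn push(&mut self, name: &str) {
        self.frames.push(name.to_string());
    }

    /// Assumed to be called when the production that pushed the last name finishes.
    fn pop(&mut self) {
        self.frames.pop();
    }

    /// Prefixes the parser's own message with the position and the joined context path.
    fn report(&self, line: usize, col: usize, message: &str) -> String {
        if self.frames.is_empty() {
            format!("{line}:{col}: {message}")
        } else {
            format!("{line}:{col}: {}: {message}", self.frames.join("->"))
        }
    }
}

/// Replays the pushes that the example input would trigger before the failure at ';'.
fn demo() -> String {
    let mut ctx = ContextStack::default();
    for name in ["Type", "Connection", "Member", "state", "DefaultValue"] {
        ctx.push(name);
    }
    ctx.report(1, 33, "unexpected ';', expecting '}'")
}
```

Keeping the stack separate from the message text means the existing "unexpected ..., expecting ..." wording would not need to change; only a prefix is added.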

If we tighten this down to one $^ per production, the prefix format can be designed so that, in Rust terms:

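// production: Option<&str> (name of a '^'-marked production)
// capture: Option<&str> (text matched by its single $^ capture)
match (production, capture) {
  (Some(production), Some(capture)) => format!("{production} '{capture}': "),
  (Some(production), None) => format!("{production}: "),
  (None, Some(capture)) => format!("'{capture}': "),
  (None, None) => String::new(),
}
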
# given: `a b`
^JustType <- 'a' 'b' 'c'  # error: JustType: expecting 'c'
JustCap <- 'a' $^name<'b'> 'c'  # error: 'b': expecting 'c'
^Both <- 'a' $^name<'b'> 'c'  # error: Both 'b': expecting 'c'
Neither <- 'a' 'b' 'c'  # error: expecting 'c'
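
The four cases can be checked with a small standalone function. This is only a sketch of the proposed formatting rule; `error_prefix` and `full_message` are hypothetical names and nothing here is part of cpp-peglib.

```rust
/// Hypothetical helper: builds the prefix from the production that opted in
/// with '^' (if any) and the text of its single `$^` capture (if any).
fn error_prefix(production: Option<&str>, capture: Option<&str>) -> String {
    match (production, capture) {
        (Some(p), Some(c)) => format!("{p} '{c}': "),
        (Some(p), None) => format!("{p}: "),
        (None, Some(c)) => format!("'{c}': "),
        (None, None) => String::new(),
    }
}

/// Prepends the prefix to the message the parser already produces.
fn full_message(production: Option<&str>, capture: Option<&str>, message: &str) -> String {
    format!("{}{message}", error_prefix(production, capture))
}
```

For example, `full_message(Some("Both"), Some("b"), "expecting 'c'")` yields `Both 'b': expecting 'c'`, and passing `None` for both arguments leaves the message unchanged.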

yhirose commented 8 months ago

@kfsone thanks for the ideas. I am not planning to implement any of these at this point, but I will keep this open as an 'enhancement'.