yhirose / cpp-peglib

A single file C++ header-only PEG (Parsing Expression Grammars) library
MIT License
900 stars 112 forks source link

Proposal for Error Message Macro Support #177

Closed stoneRdev closed 2 years ago

stoneRdev commented 3 years ago

Hello, I'd like to propose an addition to your library. To start with, error messaging has been immensely useful. However, I believe I've found a shortcoming that could be (possibly?) trivially fixed.

I'd like to propose macro enabled error messages, something like:

ERROR_M(T) { message "Unexpected %1"}
ERROR_MM(T,T) {message "Unexpected %1, expected %2"}

Where %1 and %2 can be a captures to the arguments. The behavior of the arguments (in my best opinion) should be like so: if the argument is a string literal, use the string e.g. taken from above: ERROR_M("abc") would result in a message of "Unexpected abc" if the argument is a rule name, use the name of the rule e.g. taken from above: ERROR_M(FOR_KEYWORD) would result in a message of "Unexpected FOR_KEYWORD"

This would be nice to have in situations like this:

FOR_ERROR(T,TT) { message "%1 missing in %2 statement"}
FOR_IN <- FOR_KEYWORD  "("  LEFT_EXPRESSION^FOR_ERROR(LEFT_EXPRESSION,FOR_IN) IN_KEYWORD  EXPRESSION  ")"  STATEMENT
FOR_OF <- FOR_KEYWORD  "("  LEFT_EXPRESSION^FOR_ERROR(LEFT_EXPRESSION,FOR_OF)  OF_KEYWORD  EXPRESSION  ")"  STATEMENT

Where normally one would have to write an error label for each error they expect to encounter, following the example above:

FOR_IN_LEFT_EXPRESSION_ERROR { message "LEFT_EXPRESSION missing in FOR_IN statement"}
FOR_OF_LEFT_EXPRESSION_ERROR { message "LEFT_EXPRESSION missing in FOR_OF statement"}
FOR_IN <- FOR_KEYWORD  "("  LEFT_EXPRESSION^FOR_IN_LEFT_EXPRESSION_ERROR IN_KEYWORD  EXPRESSION  ")"  STATEMENT
FOR_OF <- FOR_KEYWORD  "("  LEFT_EXPRESSION^FOR_OF_LEFT_EXPRESSION_ERROR  OF_KEYWORD  EXPRESSION  ")"  STATEMENT

While this is a trivial example, I currently have a grammar with 47 error messages, and I've determined I can reduce that to 14 marcos if this feature is included.

Not only will this feature make grammars shorter and easier to read, but I believe this feature would also improve a user experience by making error messages more dynamic while maintaining backwards compatibility.

Thank you for your time regarding my request and thank you for the time and good work you've put into this library

yhirose commented 3 years ago

@stoneRdev, thank you for the interesting idea to make the error message handling easier. I actually would not like to introduce any special syntax which could be confused with existing PEG syntax. But I understand what you intend to do with the proposal.

I'll try to come up with a way to do the similar with just action block { ... } where we can do whatever things we want without touching existing PEG syntax. Thanks for your comment!

stoneRdev commented 3 years ago

@yhirose Thank you for your response! I completely understand not wanting to change the syntax. I also like the idea of an action-only way of handling this.

While I looked over the source to see if I could help, I don't have the time to familiarize myself with the internals of the library, but from my quick glance, maybe adding inside ErrorInfo::output_log a replace call for %e being the expected token and %m being the token that failed the match.

I'm not sure where the information for those two values live, but I'm sure they have to be somewhere as the parser already knows the token it expects next (maybe a little convoluted, but maybe %e can be an array, %e printing a comma list of all expected tokens and %e0 to %e{number} could get the expected tokens in order). As for %m being the matched token, I'm sure that's possible as well since the parser knows what token it's on.

If you're able to tell me where those two data points live at, I could possibly make the change myself and submit a pull and save you some work.

In the mean time, I'll try to dig through the source and find the location of those two data points myself and see if I can save you the work altogether

yhirose commented 2 years ago

I close it for now since I am not planning to implement it.