yhirose / cpp-peglib

A single file C++ header-only PEG (Parsing Expression Grammars) library
MIT License

.enter/.leave get called twice, but shouldn't. #264

Closed. gamagan closed this issue 1 year ago

gamagan commented 1 year ago

Is it normal that parser["RULE"].enter/leave get called twice for one rule that only matches 1 input?

I have a fairly large grammar and a test input that should match one rule exactly once. That rule has enter/leave handlers and a normal action handler. All three are called correctly the first time. Afterwards, enter and leave (but not the action handler) are called again, with an empty const char* and the size_t n param set to 0. That second call seems incorrect.

parser["Rule"].enter = [this](const peg::Context&, const char* s, size_t n, std::any& dt) {
    // Gets called twice, the second time with n=0
};

parser["Rule"] = [this](const peg::SemanticValues& vals) {
    // Gets called once
};

parser["Rule"].leave = [this](const peg::Context&, const char* s, size_t n, size_t matchLen, std::any& value, std::any& dt) {
    // Gets called twice, the second time with n=0
};

yhirose commented 1 year ago

@gamagan, thanks for the report. I don't know your grammar, but this situation can happen. enter and leave are called whenever the PEG parser tries to evaluate a rule, whether or not the attempt succeeds. Action handlers, however, are called only when the input text matches the rule. Here is an example.

#include <cstdlib>
#include <iostream>
#include <peglib.h>

using namespace peg;

int main(void) {
  parser parser(R"(
    S <- A+
    A <- 'A'
  )");

  parser["A"].enter = [](const Context &c, const char *s, size_t n,
                         std::any &dt) { std::cout << "enter" << std::endl; };

  parser["A"] = [](const SemanticValues &vs, std::any &dt) {
    std::cout << "action!" << std::endl;
  };

  parser["A"].leave = [](const Context &c, const char *s, size_t n,
                         size_t matchlen, std::any &value,
                         std::any &dt) { std::cout << "leave" << std::endl; };

  if (parser.parse("A")) { return 0; }

  std::cout << "syntax error..." << std::endl;
  return -1;
}

If you run the above code, you will get the following result.

enter
action!
leave
enter
leave

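In this case, the parser tries to evaluate rule A twice because of A+: after the first A matches, it tries A again to see whether the repetition continues, and that second attempt fails at the end of the input. Only the first attempt succeeds, which is why the action handler is called only once.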
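
The same trace can be reproduced without the library. What follows is a minimal hand-written sketch of how a repetition such as A+ gets evaluated. It is not cpp-peglib's implementation, and eval_A and trace_S are hypothetical names introduced only for illustration. It shows that the loop has to attempt A once more than it succeeds, because only a failed attempt tells it to stop.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hand-written model of the grammar
//   S <- A+
//   A <- 'A'
// This is NOT cpp-peglib code; it only mimics when the hooks would fire.

// Evaluate rule A at position pos. Records "enter" before matching,
// "action!" only on success, and "leave" after every attempt.
bool eval_A(const std::string &input, std::size_t &pos,
            std::vector<std::string> &events) {
  events.push_back("enter");
  bool matched = pos < input.size() && input[pos] == 'A';
  if (matched) {
    ++pos;
    events.push_back("action!");
  }
  events.push_back("leave");
  return matched;
}

// Evaluate S <- A+ from the start of input and return the hook trace.
// The loop ends only when an attempt at A fails, so A is always
// attempted once more than it matches.
std::vector<std::string> trace_S(const std::string &input) {
  std::vector<std::string> events;
  std::size_t pos = 0;
  while (eval_A(input, pos, events)) {
  }
  return events;
}
```

For the input "A", trace_S("A") returns enter, action!, leave, enter, leave, which is the sequence printed above. For "AB" the second attempt fails on the 'B' rather than at the end of the input, so a failed attempt does not always see zero remaining characters.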

Hope it helps!

gamagan commented 1 year ago

So, is the correct way to deal with this situation to check for n==0 to see if it's a proper match?

yhirose commented 1 year ago

Checking n == 0 only works when the rule is evaluated at the end of the text input. There is absolutely no way to detect in enter whether the text input matches, since the text matching process takes place AFTER enter gets called.
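
Since the outcome is only known after matching, one option is to do the check in leave, which receives the match length. The sketch below assumes that cpp-peglib signals a failed attempt by passing static_cast<size_t>(-1) as matchlen; that is an assumption to verify against the peglib.h version in use. RuleStats, kNoMatch, on_enter and on_leave are hypothetical names, and the struct is deliberately independent of the library so that it compiles on its own.

```cpp
#include <cstddef>

// Assumption: a failed attempt is signalled by matchlen == size_t(-1).
constexpr std::size_t kNoMatch = static_cast<std::size_t>(-1);

// Hypothetical helper that separates attempts from real matches.
struct RuleStats {
  std::size_t attempts = 0; // incremented on every enter
  std::size_t matches = 0;  // incremented only on a successful leave

  // Call from the enter handler: the outcome is not known yet.
  void on_enter() { ++attempts; }

  // Call from the leave handler with its matchlen argument.
  // Returns true when the attempt actually produced a match.
  bool on_leave(std::size_t matchlen) {
    if (matchlen == kNoMatch) { return false; }
    ++matches;
    return true;
  }
};
```

With the handlers from the original report, enter would call stats.on_enter() and leave would call stats.on_leave(matchLen). Per-match work then goes behind the true branch instead of behind a check of n. A zero-length match still counts as a success here, since only the sentinel value is treated as failure.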