yhirose / cpp-peglib

A single file C++ header-only PEG (Parsing Expression Grammars) library
MIT License

Proposal to add trace level option #194

Closed mingodad closed 2 years ago

mingodad commented 2 years ago

Even for a small input, the output of trace can be big, so I'm proposing a way to trim it down. In the proof of concept shown below, I accept an optional integer value after the --trace option and use it to show only output with level <= opt_trace_level. Ideally we would run it first with a low trace level and raise it as needed, depending on what we see in the reduced trace output. An enhanced version could also accept an optional min_position and/or max_position to allow better control of the output.

diff --git a/lint/peglint.cc b/lint/peglint.cc
index 63d16cd..539c33e 100644
--- a/lint/peglint.cc
+++ b/lint/peglint.cc
@@ -41,6 +41,7 @@ int main(int argc, const char **argv) {
   auto opt_source = false;
   vector<char> source;
   auto opt_trace = false;
+  size_t opt_trace_level = 0;
   vector<const char *> path_list;

   auto argi = 1;
@@ -66,6 +67,27 @@ int main(int argc, const char **argv) {
       }
     } else if (string("--trace") == arg) {
       opt_trace = true;
+      if (argi < argc) {
+        std::string text = argv[argi];
+        cerr << "opt_trace_level1 = " << text << "\n";
+        int ival = 0;
+        size_t ipos = 0;
+        // Read digits left to right so that "12" parses as 12.
+        while (ipos < text.size()) {
+          auto ch = static_cast<unsigned char>(text[ipos]);
+          if (isdigit(ch)) {
+            ival = ival * 10 + (ch - '0');
+            ++ipos;
+          } else {
+            break; // non-digit: ipos stays below text.size()
+          }
+        }
+        if (ipos == text.size() && ival > 0) {
+          opt_trace_level = ival;
+          ++argi;
+          cerr << "opt_trace_level2 = " << opt_trace_level << "\n";
+        }
+      }
     } else {
       path_list.push_back(arg);
     }
@@ -133,18 +155,21 @@ int main(int argc, const char **argv) {
           auto backtrack = (pos < prev_pos ? "*" : "");
           string indent;
           auto level = c.trace_ids.size() - 1;
-          while (level--) {
-            indent += "│";
-          }
-          std::string name;
-          {
-            name = peg::TraceOpeName::get(const_cast<peg::Ope &>(ope));
-
-            auto lit = dynamic_cast<const peg::LiteralString *>(&ope);
-            if (lit) { name += " '" + peg::escape_characters(lit->lit_) + "'"; }
-          }
-          std::cout << "E " << pos << backtrack << "\t" << indent << "┌" << name
-                    << " #" << c.trace_ids.back() << std::endl;
+     if(opt_trace_level >= level || opt_trace_level == 0)
+     {
+         while (level--) {
+           indent += "│";
+         }
+         std::string name;
+         {
+           name = peg::TraceOpeName::get(const_cast<peg::Ope &>(ope));
+
+           auto lit = dynamic_cast<const peg::LiteralString *>(&ope);
+           if (lit) { name += " '" + peg::escape_characters(lit->lit_) + "'"; }
+         }
+         std::cout << "E " << pos << backtrack << "\t" << indent << "┌" << name
+               << " #" << c.trace_ids.back() << std::endl;
+     }
           prev_pos = static_cast<size_t>(pos);
         },
         [&](const peg::Ope &ope, const char *s, size_t /*n*/,
@@ -154,28 +179,31 @@ int main(int argc, const char **argv) {
           if (len != static_cast<size_t>(-1)) { pos += len; }
           string indent;
           auto level = c.trace_ids.size() - 1;
-          while (level--) {
-            indent += "│";
-          }
-          auto ret = len != static_cast<size_t>(-1) ? "└o " : "└x ";
-          auto name = peg::TraceOpeName::get(const_cast<peg::Ope &>(ope));
-          std::stringstream choice;
-          if (sv.choice_count() > 0) {
-            choice << " " << sv.choice() << "/" << sv.choice_count();
-          }
-          std::string token;
-          if (!sv.tokens.empty()) {
-            token += ", token '";
-            token += sv.tokens[0];
-            token += "'";
-          }
-          std::string matched;
-          if (peg::success(len) &&
-              peg::TokenChecker::is_token(const_cast<peg::Ope &>(ope))) {
-            matched = ", match '" + peg::escape_characters(s, len) + "'";
+     if(opt_trace_level >= level || opt_trace_level == 0)
+     {
+         while (level--) {
+           indent += "│";
+         }
+         auto ret = len != static_cast<size_t>(-1) ? "└o " : "└x ";
+         auto name = peg::TraceOpeName::get(const_cast<peg::Ope &>(ope));
+         std::stringstream choice;
+         if (sv.choice_count() > 0) {
+           choice << " " << sv.choice() << "/" << sv.choice_count();
+         }
+         std::string token;
+         if (!sv.tokens.empty()) {
+           token += ", token '";
+           token += sv.tokens[0];
+           token += "'";
+         }
+         std::string matched;
+         if (peg::success(len) &&
+             peg::TokenChecker::is_token(const_cast<peg::Ope &>(ope))) {
+           matched = ", match '" + peg::escape_characters(s, len) + "'";
+         }
+         std::cout << "L " << pos << "\t" << indent << ret << name << " #"
+               << c.trace_ids.back() << choice.str() << token << matched << std::endl;
           }
-          std::cout << "L " << pos << "\t" << indent << ret << name << " #"
-                    << c.trace_ids.back() << choice.str() << token << matched << std::endl;
         });
   }
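
The optional-argument handling in the first hunk could also be pulled out into a small standalone helper. What follows is only a minimal sketch, assuming C++17; `parse_trace_level` is a hypothetical name and not part of peglint. It accepts a string only if it consists entirely of decimal digits and is greater than zero, since 0 is reserved for "no limit" in the patch.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical helper (not part of peglint): returns the trace level encoded
// in `text`, or std::nullopt if `text` is not a positive decimal number.
// Overflow on very long digit strings is not handled in this sketch.
std::optional<std::size_t> parse_trace_level(const std::string &text) {
  if (text.empty()) { return std::nullopt; }
  std::size_t value = 0;
  for (char c : text) {
    // Cast first: std::isdigit is undefined for negative char values.
    if (!std::isdigit(static_cast<unsigned char>(c))) { return std::nullopt; }
    value = value * 10 + static_cast<std::size_t>(c - '0');
  }
  // Zero means "no limit" in the patch, so it is not a valid explicit level.
  if (value == 0) { return std::nullopt; }
  return value;
}
```

With a helper like this, the `--trace` branch would only advance `argi` when a value is returned. One consequence of the design is that a grammar file whose name consists only of digits would be taken as a trace level rather than a path.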

Example of parsing the snippet below with the peglib grammar shown here: https://github.com/yhirose/cpp-peglib/issues/193#issuecomment-1114776706

IdentStart <-  [\u0080-\uFFFF]

With the current --trace option we get 992 lines of output; with an optional trace level of 4 we get only 22 lines:

./peglint --trace 4 cpp-peglib.peg test.peg 
opt_trace_level1 = 4
opt_trace_level2 = 4
E 0 ┌[Grammar] #0
E 0 │┌Sequence #1
E 0 ││┌[Spacing] #2
E 0 │││┌Repetition #3
E 0 ││││┌PrioritizedChoice #4
L 0 ││││└x PrioritizedChoice #4
L 0 │││└o Repetition #3
L 0 ││└o [Spacing] #2
E 0 ││┌Repetition #17
E 0 │││┌[Definition] #18
E 0 ││││┌PrioritizedChoice #19
L 30    ││││└o PrioritizedChoice #19 1/2
L 30    │││└o [Definition] #18
L 30    ││└o Repetition #17
E 30    ││┌[EndOfFile] #493
E 30    │││┌NotPredicate #494
E 30    ││││┌AnyCharacter #495
L 30    ││││└x AnyCharacter #495
L 30    │││└o NotPredicate #494, match ''
L 30    ││└o [EndOfFile] #493, match ''
L 30    │└o Sequence #1
L 30    └o [Grammar] #0
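
The level check used in the patch, together with the position window suggested as a possible enhancement, might be combined into one predicate. This is a rough sketch only: `TraceFilter`, `min_position` and `max_position` are hypothetical names and are not implemented in peglint or in the patch.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical filter; none of these names exist in peglint.
struct TraceFilter {
  std::size_t max_level = 0;    // 0 means "no level limit", as in the patch
  std::size_t min_position = 0; // first input offset to report
  std::size_t max_position = std::numeric_limits<std::size_t>::max();

  // True if a trace line at nesting depth `level` and input offset `pos`
  // should be printed.
  bool accepts(std::size_t level, std::size_t pos) const {
    const bool level_ok = (max_level == 0) || (level <= max_level);
    const bool pos_ok = (pos >= min_position) && (pos <= max_position);
    return level_ok && pos_ok;
  }
};
```

Both trace callbacks could then call `accepts(level, pos)` before building the indent string, instead of repeating the `opt_trace_level >= level || opt_trace_level == 0` test.
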
mingodad commented 2 years ago

Another interesting tool is https://github.com/mqnc/pegdebug, but it stopped being updated some time ago and doesn't work with the current peglib. It uses the .enter, .match and .leave handlers, and for indentation it calls any->get, which doesn't exist anymore. peglint, by contrast, uses .trace_enter and trace_leave and gets the indentation from Context. Could someone with more knowledge of the internals of peglib update pegdebug?

yhirose commented 2 years ago

Thank you for the suggestion, but I don't need the feature at this point for my project. Could you send a pull request, so that I can review the code and determine whether it's worth merging? Thanks.