yhirose / cpp-peglib

A single file C++ header-only PEG (Parsing Expression Grammars) library
MIT License

Feature request: hints for AST optimization in the grammar #137

Closed: yesint closed this issue 3 years ago

yesint commented 3 years ago

The problem

Currently the AST optimizer accepts a list of rule names to optimize or to exclude from optimization. The drawback of this approach is that the grammar alone is not enough to see what the resulting AST will look like. While developing the grammar, one has to maintain the list of rules for the optimizer separately. This is error-prone, since it is very easy to forget to keep this list up to date.
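To make the problem concrete, here is a minimal, self-contained sketch of an optimizer that is driven by a separately maintained exclusion list. It is a toy model only: `Node`, `make`, `optimize` and `to_s` are hypothetical names introduced for illustration, and the collapsing rule (a node with exactly one child is replaced by that child) is an assumption, not cpp-peglib's actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy AST node; this is NOT cpp-peglib's peg::Ast.
struct Node {
  std::string name;                          // rule that produced the node
  std::vector<std::shared_ptr<Node>> nodes;  // child nodes
};

// Build a node from a rule name and its children.
std::shared_ptr<Node> make(std::string name,
                           std::vector<std::shared_ptr<Node>> nodes = {}) {
  return std::make_shared<Node>(Node{std::move(name), std::move(nodes)});
}

// Replace every node that has exactly one child by that child, unless the
// node's rule name is listed in `exclude`. Note that `exclude` lives outside
// the grammar, which is the maintenance problem described above.
std::shared_ptr<Node> optimize(const std::shared_ptr<Node> &node,
                               const std::set<std::string> &exclude) {
  assert(node != nullptr);
  if (node->nodes.size() == 1 && exclude.count(node->name) == 0) {
    return optimize(node->nodes[0], exclude);
  }
  auto out = make(node->name);
  for (const auto &child : node->nodes) {
    out->nodes.push_back(optimize(child, exclude));
  }
  return out;
}

// Render a tree as "NAME(child child ...)" for inspection.
std::string to_s(const std::shared_ptr<Node> &node) {
  std::string s = node->name;
  if (node->nodes.empty()) return s;
  s += "(";
  for (std::size_t i = 0; i < node->nodes.size(); ++i) {
    if (i > 0) s += " ";
    s += to_s(node->nodes[i]);
  }
  return s + ")";
}
```

In this sketch the shape of the result depends entirely on the second argument of `optimize`, so reading the grammar alone does not tell you which nodes survive.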

Suggestion

Add an optimizer hint to the grammar itself. For example:

! EXPR <- 'property' VALUE
VALUE <- FLOAT

The "!" indicates that the VALUE rule should not be optimized. The logic is the same as for "~", which excludes a rule from the AST.

This could even be generalized to have user-defined hints for AST processing. Something like:

hint1: EXPR <- 'property' VALUE
hint2: VALUE <- FLOAT

"hint1" and "hint2" would then be accessible to AstOptimizer and would mark specific categories of nodes for dedicated processing.

I'm not sure how useful such custom categories are, but the pre-defined hint for exclusion from AST optimization is definitely very useful.

yhirose commented 3 years ago

@yesint, thank you for your suggestion. But I feel that users might be confused by ! because it's already used for the Not-predicate. Also, I intentionally limit the use of enhanced notations like ~ and < to PEG parsing itself, and put anything not related to PEG parsing into an 'instruction' block { ... }.

Since the AST feature is outside the core PEG parsing feature, I would not like to include such an annotation in the main PEG grammar. But since AST generation is a pretty common feature in cpp-peglib, I am totally OK with making the feature easier to use. I am thinking of introducing another instruction keyword for this purpose. Maybe something like this?

EXPR <- 'property' VALUE { ast_opt_exclude }

It's a bit more verbose than !, but I think the instruction explains its intention more clearly and keeps this feature separate from the core PEG parsing. I'll ponder this enhancement more carefully and get back to you.
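As a rough illustration of why a per-rule instruction keeps the grammar as the single source of truth, the sketch below derives the exclusion list from the grammar text itself. It is a toy line scanner, not cpp-peglib's grammar parser: `trim` and `rules_with_instruction` are hypothetical helpers, and the code assumes one rule per line in the form `NAME <- expression { keyword }` with no braces inside literals.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Strip leading and trailing spaces and tabs.
std::string trim(const std::string &s) {
  const auto first = s.find_first_not_of(" \t");
  if (first == std::string::npos) return "";
  const auto last = s.find_last_not_of(" \t");
  return s.substr(first, last - first + 1);
}

// Collect the names of all rules whose line ends with "{ keyword }".
// Assumption: one rule per line, written as NAME <- expression { keyword }.
std::set<std::string> rules_with_instruction(const std::string &grammar,
                                             const std::string &keyword) {
  assert(!keyword.empty());
  std::set<std::string> names;
  std::istringstream in(grammar);
  std::string line;
  while (std::getline(in, line)) {
    line = trim(line);
    const auto arrow = line.find("<-");
    if (arrow == std::string::npos || line.back() != '}') continue;
    const auto open = line.rfind('{');
    if (open == std::string::npos || open < arrow) continue;
    // Text between the last '{' and the closing '}'.
    const auto body = trim(line.substr(open + 1, line.size() - open - 2));
    if (body == keyword) names.insert(trim(line.substr(0, arrow)));
  }
  return names;
}
```

Called with the grammar line above and the keyword `ast_opt_exclude`, this returns a set containing only `EXPR`, so there is no separate list to keep in sync with the grammar.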

When it comes to 'hints', it sounds like an interesting idea. I'll keep it in mind. Thanks!

yesint commented 3 years ago

@yhirose Thank you for considering this feature! Your solution with the instruction block is even better, imho. The name could be a bit shorter, like no_ast_opt.

yhirose commented 3 years ago

@yesint, I have implemented it. I also introduced an optimize_ast method on the peg::parser class. Here is the usage:

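peg::parser parser(R"(
  ...
  definition1 <- ... { no_ast_opt }
  definition2 <- ... { no_ast_opt }
  ...
)");

// Enable AST generation before parsing.
parser.enable_ast();

std::shared_ptr<peg::Ast> ast;
if (parser.parse("...", ast)) {
  // Print the unoptimized AST.
  std::cout << peg::ast_to_s(ast);

  // Optimize the AST; rules marked with no_ast_opt are excluded.
  ast = parser.optimize_ast(ast);
  std::cout << peg::ast_to_s(ast);
}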

Hope you like it!

yesint commented 3 years ago

@yhirose Great! Thanks a lot for implementing it!