arithy / packcc

A parser generator for C

Proposal: visitor pattern generation/event interface? #81

Open ethindp opened 3 months ago

ethindp commented 3 months ago

Not sure if this would be a good feature or not, but it would aid in AST generation, I think.

Right now, generating an AST looks fairly complicated to me. One way to make it easier would be a codegen mode that generates a set of functions for each rule. Each function would take either a single argument containing all of the subrules referenced in that rule (or the raw text, if the rule refers to a token), or a list of arguments, one per referenced subrule or token. To build an AST, all one would need to do is keep track of where they are on a stack. I'm thinking of something like the following:

Assume we have this rule:

subtype_declaration <-
    kw_subtype defining_identifier kw_is subtype_indication
    (aspect_specification)? semicolon

If we were to implement a C++ mode (which would be nice, but I can see how that'd be a hassle), it might generate a struct like this:

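#include <optional>

// Sketch only: each *_node type is a placeholder for the node generated for
// that rule, named so that a member does not shadow its own type.
struct subtype_declaration {
    kw_subtype_node kw_subtype;
    defining_identifier_node defining_identifier;
    kw_is_node kw_is;
    subtype_indication_node subtype_indication;
    std::optional<aspect_specification_node> aspect_specification;
    semicolon_node semicolon;
};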

Or, as an alternative, the struct could be destructured and its members passed as individual function arguments. A rule with alternations could be represented as a union.
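To make the destructured variant concrete, here is a minimal sketch of what such a generated callback could look like, with `std::variant` standing in for the union. Every name in it (the `*_node` types, `on_subtype_declaration`, `object_declaration_node`, `declared_name`) is hypothetical and chosen for illustration; nothing here is existing PackCC output, and the keyword and semicolon tokens are left out for brevity.

```cpp
#include <optional>
#include <string>
#include <variant>

// Hypothetical node types; a generator would emit one per referenced rule.
struct defining_identifier_node { std::string text; };
struct subtype_indication_node { std::string text; };
struct aspect_specification_node { std::string text; };

// The user's own AST node for the rule.
struct subtype_declaration_node {
    std::string name;
    std::string base;
    bool has_aspects;
};

// Destructured form: one parameter per referenced subrule, with the
// optional subrule mapped to std::optional.
subtype_declaration_node on_subtype_declaration(
    const defining_identifier_node& id,
    const subtype_indication_node& indication,
    const std::optional<aspect_specification_node>& aspects)
{
    return {id.text, indication.text, aspects.has_value()};
}

// A rule with alternatives, e.g. a hypothetical
//   declaration <- subtype_declaration / object_declaration
// maps to a type-safe union with one alternative per branch.
struct object_declaration_node { std::string name; };
using declaration_node =
    std::variant<subtype_declaration_node, object_declaration_node>;

// Both alternatives expose `name`, so one generic visitor handles either.
std::string declared_name(const declaration_node& d) {
    return std::visit([](const auto& alt) { return alt.name; }, d);
}
```

A plain C mode could presumably do the same with a tagged `union` and an enum discriminator; `std::variant` is used here only because the example above is already C++.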

The idea is that you would "subscribe" to the parser's "events", if you will, and build your AST that way. I believe ANTLR4 does this, as does the Heim parser generator, though I don't use either: ANTLR4 doesn't support this grammar, and Heim broke the last time I tried it. PackCC is the only parser tool I've found that supports the kind of grammar I'm working with.
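As a rough illustration of the event idea combined with the "keep track of where you are on a stack" approach, the sketch below assumes a hypothetical listener interface with `enter_rule`, `token`, and `exit_rule` callbacks. None of these names exist in PackCC today; they only show the shape such an interface might take. It also assumes the parser reports events only for matches that are finally accepted, since a backtracking PEG parser would otherwise deliver events for abandoned alternatives.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Generic tree node: a rule name, optional token text, and children.
struct ast_node {
    std::string rule;
    std::string text;
    std::vector<std::unique_ptr<ast_node>> children;
};

// Hypothetical event interface a generated parser could call into.
struct parse_listener {
    virtual ~parse_listener() = default;
    virtual void enter_rule(const std::string& rule) = 0;
    virtual void token(const std::string& text) = 0;
    virtual void exit_rule(const std::string& rule) = 0;
};

// Builds a tree by keeping the currently open rules on a stack.
class ast_builder : public parse_listener {
public:
    void enter_rule(const std::string& rule) override {
        auto node = std::make_unique<ast_node>();
        node->rule = rule;
        stack_.push_back(std::move(node));
    }

    void token(const std::string& text) override {
        if (stack_.empty()) return;  // ignore tokens outside any rule
        auto leaf = std::make_unique<ast_node>();
        leaf->rule = "token";
        leaf->text = text;
        stack_.back()->children.push_back(std::move(leaf));
    }

    void exit_rule(const std::string&) override {
        if (stack_.empty()) return;
        auto done = std::move(stack_.back());
        stack_.pop_back();
        if (stack_.empty()) {
            root_ = std::move(done);  // outermost rule finished
        } else {
            stack_.back()->children.push_back(std::move(done));
        }
    }

    const ast_node* root() const { return root_.get(); }

private:
    std::vector<std::unique_ptr<ast_node>> stack_;
    std::unique_ptr<ast_node> root_;
};
```

With this shape, user code never touches parser internals: it implements the three callbacks and reads the finished tree from `root()` once the outermost rule has exited.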