yhirose / cpp-peglib

A single file C++ header-only PEG (Parsing Expression Grammars) library
MIT License

added ability to update the grammar in a parser #152

Closed mqnc closed 3 years ago

mqnc commented 3 years ago

Hi Yuji!

I finally took the time to understand peglib more deeply, and it's a really great architecture! For my project, a parser that can update itself, I modified peglib a bit. The perform_core method inside ParserGenerator now optionally accepts a previously defined grammar and extends it. The parser class uses this inside update_grammar.

I mainly created a pull request so you can see the changes, please don't feel obliged to merge this. This is still a prototype, it lacks testing and I don't know if I overlooked some aspects that make extending an existing grammar more difficult.

I mainly wanted to know whether you would be open for including this functionality once we are sure it works or if I should rather inherit from ParserGenerator in my own project and use peglib as it is.

Furthermore, I was wondering if we can connect on LinkedIn or similar. I have become really interested in parsing and compiling and I have few people to discuss it with. But I also understand if you have no time or you are not interested.

I have appended a little test so you can see the update in action:

#include "peglib.h"
#include <cassert>
#include <iostream>
#include <cmath>

using namespace peg;
using namespace std;

int main(void) {

    parser parser(R"(
    # Grammar for Calculator...
    Additive    <- Multitive '+' Additive / Multitive
    Multitive   <- Primary '*' Multitive / Primary
    Primary     <- '(' Additive ')' / Number
    Number      <- < [0-9]+ >
    %whitespace <- [ \t]*
    )");

    assert(static_cast<bool>(parser) == true);

    parser["Additive"] = [](const SemanticValues& vs) {
        switch (vs.choice()) {
        case 0: // "Multitive '+' Additive"
            return any_cast<int>(vs[0]) + any_cast<int>(vs[1]);
        default: // "Multitive"
            return any_cast<int>(vs[0]);
        }
    };

    parser["Multitive"] = [](const SemanticValues& vs) {
        switch (vs.choice()) {
        case 0: // "Primary '*' Multitive"
            return any_cast<int>(vs[0]) * any_cast<int>(vs[1]);
        default: // "Primary"
            return any_cast<int>(vs[0]);
        }
    };

    parser["Number"] = [](const SemanticValues& vs) {
        return vs.token_to_number<int>();
    };

    parser.update_grammar(R"(
    Multitive   <- Exponent '*' Multitive / Exponent
    Exponent    <- Primary '^' Primary / Primary
    )");

    parser["Exponent"] = [](const SemanticValues& vs) {
        switch (vs.choice()) {
        case 0: // Primary '^' Primary
            return int(pow(any_cast<int>(vs[0]), any_cast<int>(vs[1])));
        default: // Primary
            return any_cast<int>(vs[0]);
        }
    };

    int val = 0; // initialized so a failed parse does not print garbage
    parser.parse(" (1 + 2) ^ 3 ", val);

    std::cout << val << "\n";
}

Cheers!

yhirose commented 3 years ago

@mqnc, thank you for the pull request. Here are my answers to your questions:

> I mainly created a pull request so you can see the changes, please don't feel obliged to merge this. This is still a prototype, it lacks testing and I don't know if I overlooked some aspects that make extending an existing grammar more difficult.

I found two things. First, what happens to definitions that are no longer used? For instance,

Original:

A <- B
B <- 'b'

Update:

A <- C
C <- 'c'

Then, B becomes an orphaned rule...
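
To make the orphaned-rule concern concrete, here is a minimal sketch that applies an update to a grammar and then lists the rules no longer reachable from the start rule. It assumes a toy model in which a grammar is just a map from rule name to the rule names it references. `Grammar`, `merge_grammar`, and `find_orphans` are hypothetical names for illustration and are not part of cpp-peglib.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy model: each rule name maps to the rule names it references.
// This is NOT peglib's internal representation.
using Grammar = std::map<std::string, std::vector<std::string>>;

// Apply an update: every definition in `update` replaces or adds to `base`.
Grammar merge_grammar(Grammar base, const Grammar& update) {
    for (const auto& entry : update) {
        base[entry.first] = entry.second;
    }
    return base;
}

// Return the rules that cannot be reached from `start`.
std::set<std::string> find_orphans(const Grammar& g, const std::string& start) {
    std::set<std::string> reachable;
    std::vector<std::string> pending{start};
    while (!pending.empty()) {
        std::string name = pending.back();
        pending.pop_back();
        if (!reachable.insert(name).second) continue; // already visited
        auto it = g.find(name);
        if (it == g.end()) continue; // reference to an undefined rule
        for (const auto& ref : it->second) pending.push_back(ref);
    }
    std::set<std::string> orphans;
    for (const auto& entry : g) {
        if (reachable.count(entry.first) == 0) orphans.insert(entry.first);
    }
    return orphans;
}
```

With the example above (`A <- B`, `B <- 'b'` updated by `A <- C`, `C <- 'c'`), `find_orphans(merged, "A")` would report `B`. Whether such rules should be removed, kept, or reported as a warning is a design decision the sketch leaves open.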

Another thing is that it's very dangerous and will cause unpredictable behavior if a user accidentally calls update_grammar in any action handler. The problem is even more obvious when a user enables packrat mode.
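
One possible way to reduce that risk is to queue updates requested during a parse and apply them only once the parse has finished. The sketch below only models the control flow. `UpdatableParser` is a hypothetical stand-in, its `parse` takes a callback instead of input text, and none of this reflects how cpp-peglib or the pull request actually works.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical wrapper, not part of peglib: it models deferring grammar
// updates that are requested while a parse is in progress.
class UpdatableParser {
public:
    explicit UpdatableParser(std::string grammar) : grammar_(std::move(grammar)) {}

    // During a parse the update is queued; otherwise it is applied at once.
    void update_grammar(const std::string& addition) {
        if (parsing_) {
            pending_ += addition;
        } else {
            grammar_ += addition;
        }
    }

    // Stand-in for a real parse: runs `action` the way a semantic action
    // would run in the middle of parsing.
    void parse(const std::function<void(UpdatableParser&)>& action) {
        parsing_ = true;
        action(*this); // may call update_grammar()
        parsing_ = false;
        // Safe point: no parse state (or packrat cache) is alive any more.
        grammar_ += pending_;
        pending_.clear();
    }

    const std::string& grammar() const { return grammar_; }

private:
    std::string grammar_;
    std::string pending_;
    bool parsing_ = false;
};
```

This trades immediacy for safety: an update issued from an action handler would not affect the rest of the current input. A use case that needs the new rules to take effect mid-parse would need a different approach, for example invalidating any cached results at the point of the update. The sketch is also not exception-safe; a throwing action would leave `parsing_` set.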

Since I don't fully understand what this feature is trying to do, my comments above might not be correct, though...

> I mainly wanted to know whether you would be open for including this functionality once we are sure it works or if I should rather inherit from ParserGenerator in my own project and use peglib as it is.

I checked the sample, but I can't come up with any situation where I would need this kind of feature yet... Are there any real-world examples that people are dealing with?

If it's a common thing and can benefit a lot of users, I don't mind including it. But we need to remember that any addition to the code base will end up increasing my maintenance cost. That's why I am very careful about adding new features, even ones that seem useful, and I keep only the features that are absolutely necessary. I have already got rid of some existing features in cpp-peglib that were useful, but not so important. (For example, I removed like this. It was a pretty neat feature, but I thought we could live without it.)

> Furthermore, I was wondering if we can connect on LinkedIn or similar. I have become really interested in parsing and compiling and I have few people to discuss it with. But I also understand if you have no time or you are not interested.

I am comfortable discussing matters related to cpp-peglib here on GitHub, because all the conversations are recorded in one place.

Thank you!

mqnc commented 3 years ago

You are probably right: this would create more maintenance work and complications than it would be useful for people. It is mainly for my niche case, which is also just an experiment. I specifically want to call it from an action handler.

I will probably just copy and modify the parser generator then.

All right, I will create issues for discussions then. Maybe you can close discussion issues right away because they are not really issues. It's a shame that GitHub doesn't have an extra section for this.