Closed (micrenda closed this 3 months ago)
Added PR https://github.com/yhirose/cpp-peglib/pull/305, which implements this feature; it may need some rework.
@micrenda thanks for the feedback, but I don't understand the example grammar... The grammar isn't valid. ('excitatopm' is not defined, and 'excitation' is not referenced.) So pegParser.load_grammar(s); doesn't work because of the incorrect grammar. cpp-peglib doesn't accept such a grammar...
Hello
In the example I wrote, I just omitted the actual implementation because it was not important (and I also made a typo!). Let me give you a valid grammar:
species <- molecule ( ' ' '(' excitation ')' )?
molecule <- ([A-Z] [a-z]? [0-9]?)+
excitation <- excitation_ele / excitation_vib / excitation_rot
excitation_ele <- 'A' / 'B' / 'C'
excitation_vib <- [0-9]* 'V' [0-9]+
excitation_rot <- 'J' [0-9]+
In my code, now I can do something like this:
pegParser = peg::parser();
pegParser.load_grammar(s);
std::any result;
pegParser.parse("H2O (2V1)", result);
And it will work perfectly.
However, with the PR https://github.com/yhirose/cpp-peglib/pull/305, it is now also possible to do this in unit tests or in other sections of the code:
pegParser = peg::parser();
pegParser.load_grammar(s);
std::any result;
pegParser.parse("2V1", result, nullptr, "excitation_vib");
For me this is a life saver :-)
Thanks for the clear explanation. I now fully understand what you would like to do. (By the way, I put comments in your pull request to fix problems that I found, and the following sample uses the revised version.)
Unfortunately, there are some situations where the parser doesn't work properly with this. The %whitespace feature is one of them.
// sample.cc
#include <iostream>
#include <peglib.h>
using namespace peg;
int main(void) {
  parser parser(R"(
    Start <- A
    A <- B (',' B)*
    B <- '[one]' / '[two]'
    %whitespace <- [ \t\n]*
  )");
  std::cout << std::boolalpha;
  std::cout << parser.parse("[one],[two]") << std::endl;
  std::cout << parser.parse(" [one] , [two] ") << std::endl;
  std::cout << parser.parse("[one],[two]", nullptr, "A") << std::endl;
  std::cout << parser.parse(" [one] , [two] ", nullptr, "A") << std::endl;
}
> ./sample
true
true
true
false
As you can see, %whitespace only works with Start. This is because cpp-peglib applies some special treatment only to the start rule. You can see what is added to the start rule in the perform_core function.
https://github.com/yhirose/cpp-peglib/blob/5ef7180a12f305ac92fad73efb4d9a7b81e5b980/peglib.h#L3992
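To make the four results above easier to reason about, here is a small hand-written model of the sample grammar. It is not cpp-peglib code. It assumes that whitespace is skipped after every literal token and that only the start rule also skips leading whitespace, which is one reading that is consistent with the output shown. The names Matcher and parse_model are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical hand-written matcher for the sample grammar:
//   A <- B (',' B)*
//   B <- '[one]' / '[two]'
//   %whitespace <- [ \t\n]*
// This is a simplified model, not the library's implementation.
struct Matcher {
  std::string_view in;
  std::size_t pos = 0;

  // %whitespace <- [ \t\n]*
  void skip_ws() {
    while (pos < in.size() &&
           (in[pos] == ' ' || in[pos] == '\t' || in[pos] == '\n')) {
      ++pos;
    }
  }

  // Assumption: a literal token matches its text, then skips trailing whitespace.
  bool lit(std::string_view s) {
    assert(pos <= in.size());
    if (in.substr(pos, s.size()) != s) return false;
    pos += s.size();
    skip_ws();
    return true;
  }

  bool B() { return lit("[one]") || lit("[two]"); }

  bool A() {
    if (!B()) return false;
    for (;;) {
      std::size_t save = pos;
      if (!(lit(",") && B())) {
        pos = save;  // backtrack a failed (',' B) repetition
        break;
      }
    }
    return true;
  }
};

// Assumption: only the start rule gets the extra leading whitespace skip.
bool parse_model(std::string_view input, bool is_start_rule) {
  Matcher m{input};
  if (is_start_rule) m.skip_ws();
  return m.A() && m.pos == input.size();
}
```

Under these assumptions, parse_model(" [one] , [two] ", false) fails on the very first character, because nothing consumes the leading space when the rule is not treated as the start rule, while the other three inputs from the sample succeed.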
@micrenda I made a change in #306 that lets users specify the start rule name in the parser constructor and in the load_grammar method. (Unfortunately, we cannot do the same in the parse method, for the reason I explained in the comment above. But I hope this pull request satisfies your needs.)
auto grammar = R"(
Start <- A
A <- B (',' B)*
B <- '[one]' / '[two]'
%whitespace <- [ \t\n]*
)";
peg::parser parser(grammar, "A"); // Start Rule is "A"
or
peg::parser parser;
parser.load_grammar(grammar, "A"); // Start Rule is "A"
parser.parse(" [one] , [two] "); // OK
Could you take a look at it when you have time? Thanks!
I would like to ask if it is possible to pass a specific target rule instead of using the main priority chain when parsing a string.
Let me clarify with an example:
Suppose I have the following rule set:
Usually, in my code, I would do something like this:
This works fine. However, in my unit tests or in other parts of the code, I might want to parse according to a specific rule. In that case, I would like to do something like this:
This way, I would use excitation_vib as the root rule and expect an exception if excitation_vib does not fully consume the input.
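To illustrate the "fully consume the input or fail" behaviour described here, the following is a hand-written sketch of a matcher for the single rule excitation_vib <- [0-9]* 'V' [0-9]+. It does not use cpp-peglib at all. The function names match_excitation_vib, fully_matches and parse_excitation_vib_or_throw are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string_view>

// Hypothetical stand-alone matcher for:  excitation_vib <- [0-9]* 'V' [0-9]+
// Returns the number of characters matched from the start of s,
// or std::string_view::npos if the rule does not match.
std::size_t match_excitation_vib(std::string_view s) {
  auto is_digit_at = [&](std::size_t k) {
    return k < s.size() && std::isdigit(static_cast<unsigned char>(s[k])) != 0;
  };
  std::size_t i = 0;
  while (is_digit_at(i)) ++i;                       // [0-9]*
  if (i >= s.size() || s[i] != 'V') return std::string_view::npos;
  ++i;                                              // 'V'
  const std::size_t first_digit = i;
  while (is_digit_at(i)) ++i;                       // [0-9]+
  if (i == first_digit) return std::string_view::npos;
  return i;
}

// True only if the rule matches and consumes the whole input.
bool fully_matches(std::string_view s) {
  return match_excitation_vib(s) == s.size();
}

// Throws if the rule fails or leaves unconsumed input behind.
void parse_excitation_vib_or_throw(std::string_view s) {
  const std::size_t n = match_excitation_vib(s);
  if (n == std::string_view::npos) {
    throw std::invalid_argument("input does not match excitation_vib");
  }
  if (n != s.size()) {
    throw std::invalid_argument("excitation_vib did not consume the whole input");
  }
}
```

With this sketch, "2V1" and "V10" are accepted, while "2V", "2V1x" and "H2O" are rejected, the second one because of trailing input.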
Is this possible? With the current implementation, to achieve something like this, I would need to change the grammar by making the target rule the new root. However, I was wondering if there is a better way to do it.
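The workaround mentioned above (making the target rule the new root) can be sketched as a plain string transformation on the grammar text. This assumes that the first definition in the grammar is taken as the start rule. The helper with_root and the rule name TestRoot are hypothetical, and the rules that are no longer reachable from the new root may trigger "not referenced" diagnostics.

```cpp
#include <string>

// Hypothetical helper: returns a copy of the grammar text with an extra
// first rule that simply forwards to `target`.
// Assumption: the parser treats the first definition as the start rule.
std::string with_root(const std::string& grammar,
                      const std::string& target,
                      const std::string& root_name = "TestRoot") {
  return root_name + " <- " + target + "\n" + grammar;
}
```

For example, with_root(s, "excitation_vib") would produce a grammar whose first line is TestRoot <- excitation_vib, followed by the original rules unchanged.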