Dervall / Piglet

An easier to use parsing and lexing tool that is configurable in fluent code without a pre-build step.
binarysculpting.com
MIT License

Unknown escaped character '^' #57

Open harrison314 opened 4 years ago

harrison314 commented 4 years ago

With the following code:

static int Exp(int a , int x)
{
    return Enumerable.Range(0, x).Aggregate(1, (acc, i) => acc * a);
}

var configurator = ParserFactory.Configure<int>();

ITerminal<int> number = configurator.CreateTerminal("\\d+", t => int.Parse(t, System.Globalization.CultureInfo.InvariantCulture));

INonTerminal<int> expr = configurator.CreateNonTerminal();
INonTerminal<int> term = configurator.CreateNonTerminal();
INonTerminal<int> factor = configurator.CreateNonTerminal();
INonTerminal<int> expexpr = configurator.CreateNonTerminal();

expr.AddProduction(expr, "+", term).SetReduceFunction(s => s[0] + s[2]);
expr.AddProduction(expr, "-", term).SetReduceFunction(s => s[0] - s[2]);
expr.AddProduction(term).SetReduceFunction(s => s[0]);

term.AddProduction(term, "*", expexpr).SetReduceFunction(s => s[0] * s[2]);
term.AddProduction(term, "/", expexpr).SetReduceFunction(s => s[0] / s[2]);
term.AddProduction(expexpr).SetReduceFunction(s => s[0]);

expexpr.AddProduction(expexpr, "^", factor).SetReduceFunction(s => Exp(s[0], s[2]));
// expexpr.AddProduction(expexpr, "\\^", factor).SetReduceFunction(s => Exp(s[0], s[2]));
expexpr.AddProduction(factor).SetReduceFunction(s => s[0]);

factor.AddProduction(number).SetReduceFunction(s => s[0]);
factor.AddProduction("(", expr, ")").SetReduceFunction(s => s[1]);

var parser = configurator.CreateParser();

var value = parser.Parse("3^4 + 1");

I get the error:

Piglet.Lexer.Construction.LexerConstructionException: 'Unknown escaped character '^''

for both "^" and "\\^".

Is there a mistake in my code?

Dervall commented 4 years ago

I am not exactly sure to be honest. It will work if you put it inside a character class like "[\\^]" instead. I have explicitly made it so, but I don't remember whether that was to follow a standard or was simply a mistake. It was a few years ago by now. :)

It could be altered, so that it would work outside of a character class as well. But using a class should work for you.

harrison314 commented 4 years ago

I'm getting the same error for "[\\^]", "[^]", and Regex.Escape("^").

Unknown6656 commented 4 years ago

Maybe this could be the case because you did not escape + and *?

Try the following:

....

expr.AddProduction(expr, @"\+", term).SetReduceFunction(s => s[0] + s[2]);
expr.AddProduction(expr, "-", term).SetReduceFunction(s => s[0] - s[2]);
expr.AddProduction(term).SetReduceFunction(s => s[0]);

term.AddProduction(term, @"\*", expexpr).SetReduceFunction(s => s[0] * s[2]);
term.AddProduction(term, "/", expexpr).SetReduceFunction(s => s[0] / s[2]);
term.AddProduction(expexpr).SetReduceFunction(s => s[0]);

expexpr.AddProduction(expexpr, "[^]", factor).SetReduceFunction(s => Exp(s[0], s[2]));
expexpr.AddProduction(factor).SetReduceFunction(s => s[0]);

....

Edit: the code above works for me, I hope it does for you as well.

Unknown6656 commented 4 years ago

> I am not exactly sure to be honest. It will work if you put it inside a character class like "[\\^]" instead.

@Dervall I could maybe try to fix this issue....

harrison314 commented 4 years ago

> Edit: the code above works for me, I hope it does for you as well.

The given code still throws the error Unknown escaped character '^'.

I use Piglet 1.5.0 in netcoreapp3.1.

Unknown6656 commented 4 years ago

> I use Piglet 1.5.0 in netcoreapp3.1.

Did you clone or download the master branch of this repository? (Just to be clear, so that I can try to reproduce this error.)

Unknown6656 commented 4 years ago

Never mind, I found the likely bug location: https://github.com/Dervall/Piglet/blob/master/Piglet/Parser/Configuration/NonTerminal.cs#L62. This seems to automatically escape the regex string upon creation of a non-terminal. I will investigate this further in the coming days.

Unknown6656 commented 4 years ago

OK, my suspicion above was correct. However, it would be wrong to call this a bug.

The following is my complete working code sample:

public static void Main(string[] args)
{
    int Exp(int a, int x) => Enumerable.Range(0, x).Aggregate(1, (acc, i) => acc * a);

    var configurator = ParserFactory.Configure<int>();

    configurator.LexerSettings.EscapeLiterals = false;

    ITerminal<int> number = configurator.CreateTerminal("\\d+", t => int.Parse(t, System.Globalization.CultureInfo.InvariantCulture));
    INonTerminal<int> expr = configurator.CreateNonTerminal();
    INonTerminal<int> term = configurator.CreateNonTerminal();
    INonTerminal<int> factor = configurator.CreateNonTerminal();
    INonTerminal<int> expexpr = configurator.CreateNonTerminal();

    expr.AddProduction(expr, "\\+", term).SetReduceFunction(s => s[0] + s[2]);
    expr.AddProduction(expr, "-", term).SetReduceFunction(s => s[0] - s[2]);
    expr.AddProduction(term).SetReduceFunction(s => s[0]);

    term.AddProduction(term, "\\*", expexpr).SetReduceFunction(s => s[0] * s[2]);
    term.AddProduction(term, "/", expexpr).SetReduceFunction(s => s[0] / s[2]);
    term.AddProduction(expexpr).SetReduceFunction(s => s[0]);

    expexpr.AddProduction(expexpr, "^", factor).SetReduceFunction(s => Exp(s[0], s[2]));
    expexpr.AddProduction(factor).SetReduceFunction(s => s[0]);

    factor.AddProduction(number).SetReduceFunction(s => s[0]);
    factor.AddProduction("\\(", expr, "\\)").SetReduceFunction(s => s[1]);

    var parser = configurator.CreateParser();
    var value = parser.Parse("3^4 + 1");

    Console.WriteLine(value);
}

Please note that I escaped "+", "(", ")", and "*" (but not "^"). Also note the following line:

configurator.LexerSettings.EscapeLiterals = false;

If the solution above is not adequate for you, I could add ^ to the list of characters handled in CharSet Piglet.Lexer.Construction.RegExLexer.EscapedCharToAcceptCharRange(char). I am not sure whether this would be a breaking change. It should not be, since it only removes an exception that was previously thrown, but @Dervall should give his opinion on the matter.
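
To see why the literals need manual escaping once EscapeLiterals is off, and why "^" fails while it is on, it may help to look at what escaping does to each operator. The sketch below is not Piglet code. It assumes that Piglet's automatic escaping behaves like .NET's Regex.Escape, which is an assumption and not something confirmed in this thread, and the class name EscapeDemo is made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of Piglet.
public static class EscapeDemo
{
    public static void Main()
    {
        // The operator literals used in the grammar above.
        string[] literals = { "+", "-", "*", "/", "^", "(", ")" };

        foreach (string literal in literals)
        {
            // Regex.Escape prefixes regex metacharacters with a backslash
            // and leaves ordinary characters unchanged.
            // Assumption: Piglet's EscapeLiterals setting does something similar.
            string escaped = Regex.Escape(literal);
            Console.WriteLine($"{literal} -> {escaped}");
        }
    }
}
```

Regex.Escape puts a backslash in front of ^, so under that assumption a literal "^" would reach the lexer as \^, which is the escape sequence named in the exception message.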

harrison314 commented 4 years ago

> The following is my complete working code sample:

This example works for me. Thanks!

Unknown6656 commented 4 years ago

Glad that I could help! :)