datalust / superpower

A C# parser construction toolkit with high-quality error reporting
Apache License 2.0

Tokenizer and Parser Assistance #152

Closed · martinkoslof closed 1 year ago

martinkoslof commented 1 year ago

I opened a previous ticket about moving from Sprache to Superpower. In my continued efforts to work through that problem, I figured I probably need to create a tokenizer and do my external validation there, then build the expressions on top of it. This is what I have so far:

Tokenizer/builder (simplified greatly for explanation purposes):

    var tokenizer = new TokenizerBuilder<PatchQueryToken>()
        .Ignore(Span.WhiteSpace)
        .Match(Character.EqualTo('('), PatchQueryToken.RParen)
        .Match(Character.EqualTo(')'), PatchQueryToken.LParen)
        .Match(Span.EqualTo("startswith"), PatchQueryToken.StartsWith)
        .Match(Span.EqualTo("endswith"), PatchQueryToken.EndsWith)
        .Match(Span.EqualTo("contains"), PatchQueryToken.Contains)
        .Match(Character.Letter.IgnoreThen(Character.LetterOrDigit.AtLeastOnce()), PatchQueryToken.Field, requireDelimiters: true)
        .Match(String, PatchQueryToken.Text) // String is my single-quoted string parser, defined elsewhere
        .Match(Character.EqualTo(','), PatchQueryToken.Comma)
        .Build();

    return tokenizer.Tokenize(filter);

The PatchQueryToken enum (also simplified greatly for explanation purposes):

    public enum PatchQueryToken
    {

        [Token(Category = "delimiter", Example = ",")]
        Comma,

        [Token(Category = "field", Example = "Name")]
        Field,

        [Token(Category = "value", Example = "'Foo'")]
        Text,

        [Token(Category = "function", Example = "startswith")]
        StartsWith,

        [Token(Category = "function", Example = "endswith")]
        EndsWith,

        [Token(Category = "function", Example = "contains")]
        Contains,

        [Token(Example = "(")]
        LParen,

        [Token(Example = ")")]
        RParen
    }

The token list parser (I need to chain others together, but this one zeroes in on my problem):

    public static readonly TokenListParser<PatchQueryToken, Expression> Contains =
        from prefix in Token.EqualTo(PatchQueryToken.Contains)
        from lparen in Token.EqualTo(PatchQueryToken.LParen)
        from field in Token.EqualTo(PatchQueryToken.Field)
        from comma in Token.EqualTo(PatchQueryToken.Comma)
        from value in Token.EqualTo(PatchQueryToken.Text)
        from rparen in Token.EqualTo(PatchQueryToken.RParen)
        select CallFunction(prefix.ToStringValue(), field.ToStringValue(), value.ToStringValue());

The CallFunction method just builds the "contains"/"startswith" call via Expression.Call (the code never gets that far). I am clearly misunderstanding something, but I believe this should parse: Contains token -> open paren -> field -> comma -> text value -> close paren.
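For completeness, CallFunction is roughly this shape (again simplified; `Param` and the quote-stripping helper are illustrative, not my exact code):

    // Requires using System.Linq.Expressions; Data is the entity type from my test below.
    static readonly ParameterExpression Param = Expression.Parameter(typeof(Data), "x");

    static Expression CallFunction(string function, string field, string value)
    {
        // The Text token still carries its quotes, e.g. 'velop', so strip them first
        var literal = Expression.Constant(value.Trim('\''));
        var property = Expression.Property(Param, field);

        // Map the function token text onto the matching string method
        var methodName = function switch
        {
            "startswith" => nameof(string.StartsWith),
            "endswith" => nameof(string.EndsWith),
            _ => nameof(string.Contains)
        };

        var method = typeof(string).GetMethod(methodName, new[] { typeof(string) })!;
        return Expression.Call(property, method, literal);
    }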

My simple test:

    var filter = "contains(Name, 'velop')";

    var tokenizer = new PatchQueryTokenizer();
    var tokens = tokenizer.Tokenize(filter);

    var why = PatchQueryTokenParser<Data>.Contains(tokens); // parser error: Syntax error (line 1, column 9): unexpected `(`, expected `(`.

Here is the output I see when I print the tokens returned by Tokenize; it looks correct to me (each line is the token kind, a dash, then the token's string value):

    Contains - contains
    RParen - (
    Field - Name
    Comma - ,
    Text - 'velop'
    LParen - )

I am getting a fairly cryptic error saying `(` is unexpected but `(` is expected. I must be missing something small and probably obvious to others, but I'm totally baffled at this point and don't know how to unblock myself.

Can anyone tell me what I'm doing wrong here?

martinkoslof commented 1 year ago

Bleh! Never mind, I see my error: my tokenizer's paren tokens are backwards :/. Closing this.
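For anyone who lands here with the same message: as far as I can tell, the parser reports the unexpected token by its source text but the expected token by its kind's Example attribute, so a kind mismatch on identical text comes out as "unexpected `(`, expected `(`". The fix is just swapping the two kinds in the builder:

    .Match(Character.EqualTo('('), PatchQueryToken.LParen)
    .Match(Character.EqualTo(')'), PatchQueryToken.RParen)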

nblumhardt commented 1 year ago

Great! Glad you were able to track it down :+1:

martinkoslof commented 1 year ago

Thanks. I have another issue I'll open up :)