picoe / Eto.Parse

Recursive descent LL(k) parser for .NET with Fluent API, BNF, EBNF and Gold Grammars
MIT License
148 stars 30 forks source link

AlternativeParser doesn't work in one case, but works if I switch the order #64

Closed papler closed 8 months ago

papler commented 8 months ago

thisWayIsBroken = wsMandatory | ( wsOptional & new CharSetTerminal( ':', '=' ) & wsOptional ); butThisWayWorksOK = ( wsOptional & new CharSetTerminal( ':', '=' ) & wsOptional ) | wsMandatory;

This is for a name-value parser, where the name and value can be separated by either just spaces, or a colon or an equal sign. In case of colon or equal sign, spaces in between the name and colon/equalsign and value are optional.

    //All these examples should be valid
    List<string> examples = new List<string>() {

                " ",
                 "   ",
                ":",
                " :",
                ":  ",
                 "  :  ",
                 "=",
                " =",
                "=  ",
                 "  =  "
    };

    var wsMandatory = new RepeatCharTerminal( char.IsWhiteSpace, 1 );
    var wsOptional = new RepeatCharTerminal( char.IsWhiteSpace, 0 );

    var thisWayIsBroken = wsMandatory |  ( wsOptional & new CharSetTerminal( ':', '=' ) & wsOptional );

    var g = new Grammar();
    g.Inner = thisWayIsBroken;
    foreach( var e in examples )
    {
                var match = g.Match( e );
                Console.WriteLine( "{0}   \"{1}\"", ( match.Success ? "ok  " : "FAIL" ), e );
    }

    var butThisWayWorksOK =  ( wsOptional & new CharSetTerminal( ':', '=' ) & wsOptional ) | wsMandatory;

All of the examples should be passing, but I get:

ok     " "
ok     "   "
ok     ":"
FAIL   " :"
ok     ":  "
FAIL   "  :  "
ok     "="
FAIL   " ="
ok     "=  "
FAIL   "  =  "
cwensley commented 8 months ago

Hi @papler,

Thanks for submitting the issue. This is a known behaviour of the alternative parser and a way recursive descent parsers generally work. The first match will be the one it uses, so you have to order the alternatives in a way so that the "longest match" comes first.

So, since wsMandatory matches the ones that start with a space, it won't even bother trying the other alternative.

papler commented 8 months ago

Thanks for the reply. I strongly considered that this is just me not using it right, since it's such a mature library.

So once it matches some mandatory whitespace, it then doesn't try the alternative, but goes ahead with the "=" or ":". However, doesn't it backtrack? This alternative is unsuccessful - try other ones?

cwensley commented 8 months ago

The problem is that it is successful and reached the end of your parser grammar. Eto.Parse is not backtracking in any way currently.

papler commented 8 months ago

Ah ok, I'll continue writing this grammar with that in mind. Thank you!