GerHobbelt / jison

bison / YACC / LEX in JavaScript (LALR(1), SLR(1), etc. lexer/parser generator)
https://gerhobbelt.github.io/jison/
MIT License

Longest match isn't respected with literal strings (differs from zaach/jison) #45

Closed seanlaff closed 5 years ago

seanlaff commented 5 years ago

I think I'm noticing jison being too greedy with lex grammars that contain string literals.

Here's a tiny grammar that allows you to and/or strings together, returning a binary tree.

/* lexical grammar */
%lex
%%

<<EOF>>               return 'EOF'
\s                    /* skip whitespace */
"and"                 return 'AND'
"or"                  return 'OR'
[a-z]*                return 'WORD'

/lex

/* operator associations and precedence */

%left 'AND'
%left 'OR'

%start expressions

%% /* language grammar */

expressions
    : e EOF
        {return $1;}
    ;

e
    : e AND e
        {$$ = {type:"and", lft: $1, rgt:$3};}
    | e OR e
        {$$ = {type:"or", lft: $1, rgt:$3};}
    | WORD
        {$$ = $1;}
    ;

When I try a query like "paul and andre", I get a parser error: it tries to read the "and" at the start of "andre" as an AND node, rather than reading the whole word as a WORD node.

However, when I use the same grammar in the original jison, it parses as expected.
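
For anyone hitting the same thing, here is a rough sketch of the difference using a toy tokenizer. It is not jison's actual lexer: `tokenize`, `rules`, and the `longest` flag are hypothetical names, and `[a-z]+` stands in for `[a-z]*` only so the toy loop never makes an empty match. It assumes one lexer that takes the first matching rule in file order and one that prefers the longest match, with ties going to the earlier rule, as flex does.

```javascript
// Toy model only: rules are listed in the same order as the lex grammar above.
const rules = [
  { re: /^\s+/, token: null },       // whitespace is skipped
  { re: /^and/, token: 'AND' },
  { re: /^or/, token: 'OR' },
  { re: /^[a-z]+/, token: 'WORD' },
];

// longest === false: the first rule that matches wins.
// longest === true: the longest match wins; ties go to the earlier rule.
function tokenize(input, ruleSet, longest) {
  const tokens = [];
  let rest = input;
  while (rest.length > 0) {
    let best = null;
    for (const rule of ruleSet) {
      const m = rule.re.exec(rest);
      if (!m) continue;
      if (best === null || (longest && m[0].length > best.text.length)) {
        best = { token: rule.token, text: m[0] };
      }
      if (!longest) break; // stop at the first matching rule
    }
    if (best === null) throw new Error('Unrecognized text: ' + rest);
    if (best.token !== null) tokens.push(best.token + ':' + best.text);
    rest = rest.slice(best.text.length);
  }
  return tokens;
}

const firstMatch = tokenize('paul and andre', rules, false);
const longestMatch = tokenize('paul and andre', rules, true);

console.log(firstMatch);   // [ 'WORD:paul', 'AND:and', 'AND:and', 'WORD:re' ]
console.log(longestMatch); // [ 'WORD:paul', 'AND:and', 'WORD:andre' ]
```

The first-match token stream contains two AND tokens in a row, which the grammar above cannot parse, and that would explain the parser error.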

seanlaff commented 5 years ago

Aha, found out there is documentation about this:

https://github.com/zaach/jison/wiki/Deviations-From-Flex-Bison#user-content-literal-tokens

%options easy_keyword_rules did the trick for me. Thanks!
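
As I understand the linked wiki page, the option works by appending a word-boundary check (`\b`) to rules that end in a literal word character. The snippet below is only a sketch of that idea in plain JavaScript, not the generator's real code; `withWordBoundary` is a hypothetical helper made up for illustration.

```javascript
// Assumption: a literal rule such as "and" becomes /^and\b/ when it ends
// in a word character, so it can no longer match a prefix of a longer word.
function withWordBoundary(literal) {
  // Escape regex metacharacters so the literal is matched verbatim.
  const escaped = literal.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const suffix = /\w$/.test(literal) ? '\\b' : '';
  return new RegExp('^' + escaped + suffix);
}

const plainAnd = /^and/;                    // literal without the boundary check
const keywordAnd = withWordBoundary('and'); // equivalent to /^and\b/

console.log(plainAnd.test('andre'));       // true: matches the prefix of "andre"
console.log(keywordAnd.test('andre'));     // false: "and" is followed by "r"
console.log(keywordAnd.test('and andre')); // true: "and" is followed by a space
```

With the boundary in place, "andre" falls through to the WORD rule even though the "and" rule is listed first.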