GerHobbelt / jison

bison / YACC / LEX in JavaScript (LALR(1), SLR(1), etc. lexer/parser generator)
https://gerhobbelt.github.io/jison/
MIT License

Doesn't seem to pick longest possible match #62

palantus closed this issue 3 years ago

palantus commented 3 years ago

Hi

I have a problem while creating a parser for a programming language. Whenever an identifier (e.g. a variable or an enumeration value) has a name that starts with a keyword, such as "dog", the lexer matches only the prefix "do" (as in do-while). I've created this minimal example:

%lex

%%

"do"                        return 'DO';
[a-zA-Z_][a-zA-Z0-9_]*      return 'ID';
"::"                        return 'DOUBLECOLON';
<<EOF>>                     return 'ENDOFFILE';

/lex

%%

start
    : ID DOUBLECOLON ID ENDOFFILE
    {$$ = {type: "enumval", enum: $1, val: $3}}
    ;

When I use that to parse PetTypes::dog, I get the following error:

JisonParserError: Parse error on line 1:
PetTypes::dog
----------^
Expecting "ID", got unexpected DO

As far as I understand, the lexer should match ID here under the flex pattern-matching rules, because ID gives the longest possible match.
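To make the difference concrete, here is a minimal sketch in plain JavaScript (it does not use jison) that contrasts a first-match strategy with the flex-style longest-match rule for the three rules above. The names `rules`, `firstMatch` and `longestMatch` are made up for this illustration, and the sketch only models the two strategies; it is not how either jison version is implemented.

```javascript
// Token rules in the same order as in the lexer section of the report.
const rules = [
  { token: 'DO', pattern: /^do/ },
  { token: 'ID', pattern: /^[a-zA-Z_][a-zA-Z0-9_]*/ },
  { token: 'DOUBLECOLON', pattern: /^::/ },
];

// First-match: return the first rule (in declaration order) that matches.
function firstMatch(input) {
  for (const { token, pattern } of rules) {
    const m = pattern.exec(input);
    if (m) return { token, text: m[0] };
  }
  return null;
}

// Longest-match (flex-style): try every rule and keep the longest lexeme.
// On a tie, the rule declared earlier wins.
function longestMatch(input) {
  let best = null;
  for (const { token, pattern } of rules) {
    const m = pattern.exec(input);
    if (m && (best === null || m[0].length > best.text.length)) {
      best = { token, text: m[0] };
    }
  }
  return best;
}

console.log(firstMatch('dog'));   // { token: 'DO', text: 'do' }
console.log(longestMatch('dog')); // { token: 'ID', text: 'dog' }
console.log(longestMatch('do'));  // { token: 'DO', text: 'do' }
```

With first-match, "dog" is cut after "do", which matches the error shown above. With longest-match, "dog" becomes a single ID, while a bare "do" still resolves to DO because the earlier rule wins the tie.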

I also tried the original jison npm package, and it does not produce this error. It only complains, as expected, when I parse PetTypes::do.

Is this a bug or am I missing something?

palantus commented 3 years ago

Oops, I was not aware of the flex and easy_keyword_rules options. Please ignore this issue. If you run into the same problem, see the answer here for more information :)

https://stackoverflow.com/questions/65581744/jison-how-do-i-avoid-dog-being-parsed-as-do
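For readers who land here, a rough sketch of the idea behind a keyword-boundary rule, in plain JavaScript. It assumes (this thread does not confirm it) that easy_keyword_rules makes keyword-like literals such as "do" match only when followed by a word boundary. The `keyword`, `identifier` and `nextToken` names are hypothetical and are not part of jison.

```javascript
// Assumption: a keyword literal only counts when a word boundary follows it.
const keyword = /^do\b/;
const identifier = /^[a-zA-Z_][a-zA-Z0-9_]*/;

// Classify the token at the start of the input; rule order is kept
// (keyword first), but the boundary check stops "do" from eating "dog".
function nextToken(input) {
  if (keyword.test(input)) return 'DO';
  if (identifier.test(input)) return 'ID';
  return null;
}

console.log(nextToken('dog'));   // ID
console.log(nextToken('do'));    // DO
console.log(nextToken('do_it')); // ID
```

The flex option takes the other route and picks the longest match across all rules, so either approach keeps "dog" from being split into "do" plus "g".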