GerHobbelt / jison

bison / YACC / LEX in JavaScript (LALR(1), SLR(1), etc. lexer/parser generator)
https://gerhobbelt.github.io/jison/
MIT License

lexer rules for jison (ebnf-parser and lex-parser modules) b0rk very late on unterminated string #13

Closed GerHobbelt closed 6 years ago

GerHobbelt commented 7 years ago

Current lexer rules don't identify an unterminated string in action code when it crosses a newline.

Example typo: a string was started as an ES6 template literal (opening backquote) but never terminated with a backquote; the old closing double quote was left in place. Note the code chunk at the first TODO:

option
    : NAME[option]
        { $$ = [$option, true]; }
    | NAME[option] '=' OPTION_STRING_VALUE[value]
        { $$ = [$option, $value]; }
    | NAME[option] '=' OPTION_VALUE[value]
        { $$ = [$option, parseValue($value)]; }
    | NAME[option] '=' NAME[value]
        { $$ = [$option, parseValue($value)]; }
    | NAME[option] '=' error
        {
            // TODO ...
            yyerror(`named %option value error for ${$option}?\n\n  Erroneous area:\n" + prettyPrintRange(yylexer, @error, @option));
        }
    | NAME[option] error
        {
            // TODO ...
            yyerror("named %option value assignment error?\n\n  Erroneous area:\n" + prettyPrintRange(yylexer, @error, @option));
        }
    ;
...
%%
...
// 500 lines down, right smack in the middle of the trailing code chunk,
// jison barfs a hairball about a "possibly missing semicolon?"
//     |:-(

This results in an error report about 500 (!) lines down. Diagnosing the error was easy, but it required backtracking through the git log (diff inspection via TortoiseGit and Beyond Compare). jison should ideally report something sensible near the origin of the error, rather than requiring users to pull out the rest of their dev toolchest to dig out the mistake.
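To illustrate what "report near the error origin" could look like, here is a minimal JavaScript sketch of a standalone pre-check over action code. It is not jison's lexer and not how the fix was implemented: `findUnterminatedString` is a hypothetical helper name, and the scan is deliberately simplified (it skips comments but does not understand regex literals or nested `${...}` expressions inside template literals). The point is only that remembering the line where a string was opened lets the report point there instead of at wherever the parser finally gives up.

```javascript
// Hypothetical helper (an assumption for illustration, not part of jison).
// Returns { quote, line } for the first string literal that is never closed,
// where `line` is the 1-based line on which the string was OPENED,
// or null when every string literal is terminated.
function findUnterminatedString(source) {
  let line = 1;
  let i = 0;
  while (i < source.length) {
    const ch = source[i];
    if (ch === '\n') { line++; i++; continue; }
    // Skip "//" line comments so apostrophes in comments are ignored.
    if (ch === '/' && source[i + 1] === '/') {
      while (i < source.length && source[i] !== '\n') i++;
      continue;
    }
    // Skip "/* ... */" block comments, still counting newlines.
    if (ch === '/' && source[i + 1] === '*') {
      const end = source.indexOf('*/', i + 2);
      const stop = end === -1 ? source.length : end + 2;
      for (; i < stop; i++) if (source[i] === '\n') line++;
      continue;
    }
    if (ch === '"' || ch === "'" || ch === '`') {
      const quote = ch;
      const startLine = line; // remember where the string began
      let closed = false;
      i++;
      while (i < source.length) {
        const c = source[i];
        if (c === '\\') {               // escaped character: skip it
          if (source[i + 1] === '\n') line++;
          i += 2;
          continue;
        }
        if (c === quote) { closed = true; i++; break; }
        if (c === '\n') {
          if (quote !== '`') break;     // '...' and "..." may not span lines
          line++;                       // template literals may
        }
        i++;
      }
      if (!closed) return { quote, line: startLine };
      continue;
    }
    i++;
  }
  return null;
}

// Usage: a cut-down version of the typo from this issue.
const sample = [
  '{',
  '    // TODO ...',
  '    yyerror(`value error for ${name}?\\n" + rest);',
  '}',
  '// many lines later',
  'var x = "fine";'
].join('\n');

const hit = findUnterminatedString(sample);
if (hit) {
  console.log('unterminated ' + hit.quote + ' string opened on line ' + hit.line);
}
```

For a backquoted string the problem can only be confirmed at the next backquote or at end of input, since template literals may legitimately span lines; that is why the sketch records the opening line up front and reports that rather than the position where scanning stopped.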

GerHobbelt commented 7 years ago

TODO: test whether this actually behaves better since the 0.6.0-191 release, as this one was pretty horrible to discover. 😢