jeffreykegler / Marpa--R2

Parse any language you can describe in BNF
https://jeffreykegler.github.io/Marpa-web-site/

failure of MarpaX::Languages::ECMAScript::AST with 2.096 #208

Closed jddurand closed 10 years ago

jddurand commented 10 years ago

One of the test scripts in MarpaX::Languages::ECMAScript::AST fails like this:

not ok 4 - <undef>
#   Failed test '<undef>'
#   at t/asi.t line 38.

********************************
Failed test was with:--->"a = b
++c"<---.

Eventual $@ is:
"Error in SLIF parse: Unrecognized problem code: unpermitted mix of external and internal scanning
* String before error: a = b\n++
* The error was at line 2, column 3, and at character 0x0063 'c', ...
* here: c\s
Marpa::R2 exception at lib/MarpaX/Languages/ECMAScript/AST/Impl.pm line 134.

[Syntax error]

line:column 1:6 (Unicode newline count) 1:6 (\n count)

a = b
-----^"
********************************

Running this script with Marpa::R2's traces turned on gives:

2014/10/25 06:29:58 DEBUG   5475 Creating grammar key
2014/10/25 06:29:58 DEBUG   5475 Creating grammar object
2014/10/25 06:29:58 DEBUG   5475 Creating grammar key
2014/10/25 06:29:58 DEBUG   5475 Creating grammar object
2014/10/25 06:29:58 DEBUG   5475 Creating grammar key
2014/10/25 06:29:58 DEBUG   5475 Creating grammar object
2014/10/25 06:29:58 DEBUG   5475 Creating grammar key
2014/10/25 06:29:58 DEBUG   5475 Creating grammar object
2014/10/25 06:29:58 DEBUG   5475 Creating grammar key
2014/10/25 06:29:58 DEBUG   5475 Creating grammar object
2014/10/25 06:29:58 DEBUG   5475 Creating grammar key
2014/10/25 06:29:58 DEBUG   5475 Creating grammar object
2014/10/25 06:29:58 TRACE   5475 Setting trace_terminals option
2014/10/25 06:29:58 TRACE   5475 Setting trace_values option
2014/10/25 06:29:58 TRACE   5475 Lexer "L0" accepted lexeme L1c1 e1: IDENTIFIER; value="a"
2014/10/25 06:29:58 TRACE   5475 Lexer "L0" discarded lexeme L1c2: _S_MANY
2014/10/25 06:29:58 TRACE   5475 Lexer "L0" accepted lexeme L1c3 e2: ASSIGN; value="="
2014/10/25 06:29:58 TRACE   5475 Lexer "L0" discarded lexeme L1c4: _S_MANY
2014/10/25 06:29:58 TRACE   5475 Lexer "L0" accepted lexeme L1c5 e3: IDENTIFIER; value="b"
2014/10/25 06:29:58 TRACE   5475 Lexer "L0" discarded lexeme L1c6: _S_MANY

Uncaught exception from user code:
        Error in SLIF parse: Unrecognized problem code: unpermitted mix of external and internal scanning
        * String before error: a = b\n++
        * The error was at line 2, column 3, and at character 0x0063 'c', ...
        * here: c\s
        Marpa::R2 exception at lib/MarpaX/Languages/ECMAScript/AST/Impl.pm line 134.

        [Syntax error]

        line:column 1:6 (Unicode newline count) 1:6 (\n count)

        a = b
        -----^

        Context:

        P0 @4-4 L1c6 Literal -> . NullLiteral
        P1 @4-4 L1c6 Literal -> . BooleanLiteral
        P2 @4-4 L1c6 Literal -> . NumericLiteral
        P3 @4-4 L1c6 Literal -> . StringLiteral
        P4 @4-4 L1c6 Literal -> . RegularExpressionLiteral
        P5 @4-4 L1c6 PrimaryExpression -> . THIS
        P6 @4-4 L1c6 PrimaryExpression -> . IDENTIFIER
        P7 @4-4 L1c6 PrimaryExpression -> . Literal
        P8 @4-4 L1c6 PrimaryExpression -> . ArrayLiteral
        P9 @4-4 L1c6 PrimaryExpression -> . ObjectLiteral
        P10 @4-4 L1c6 PrimaryExpression -> . LPAREN Expression RPAREN
        P11 @4-4 L1c6 ArrayLiteral -> . LBRACKET Elisionopt RBRACKET
        P12 @4-4 L1c6 ArrayLiteral -> . LBRACKET ElementList RBRACKET
        P13 @4-4 L1c6 ArrayLiteral -> . LBRACKET ElementList COMMA Elisionopt RBRACKET
        P20 @4-4 L1c6 ObjectLiteral -> . LCURLY RCURLY
        P21 @4-4 L1c6 ObjectLiteral -> . LCURLY PropertyNameAndValueList RCURLY
        P22 @4-4 L1c6 ObjectLiteral -> . LCURLY PropertyNameAndValueList COMMA RCURLY
        P32 @4-4 L1c6 MemberExpression -> . PrimaryExpression
        P33 @4-4 L1c6 MemberExpression -> . FunctionExpression
        P34 @4-4 L1c6 MemberExpression -> . MemberExpression LBRACKET Expression RBRACKET
        P35 @4-4 L1c6 MemberExpression -> . MemberExpression DOT IDENTIFIERNAME
        P36 @4-4 L1c6 MemberExpression -> . NEW MemberExpression Arguments
        P37 @4-4 L1c6 NewExpression -> . MemberExpression
        P38 @4-4 L1c6 NewExpression -> . NEW NewExpression
        P39 @4-4 L1c6 CallExpression -> . MemberExpression Arguments
        P40 @4-4 L1c6 CallExpression -> . CallExpression Arguments
        P41 @4-4 L1c6 CallExpression -> . CallExpression LBRACKET Expression RBRACKET
        P42 @4-4 L1c6 CallExpression -> . CallExpression DOT IDENTIFIERNAME
        P47 @4-4 L1c6 LeftHandSideExpression -> . NewExpression
        P48 @4-4 L1c6 LeftHandSideExpression -> . CallExpression
        P49 @4-4 L1c6 PostfixExpression -> . LeftHandSideExpression
        P50 @4-4 L1c6 PostfixExpression -> . LeftHandSideExpression PLUSPLUS_POSTFIX
        P51 @4-4 L1c6 PostfixExpression -> . LeftHandSideExpression MINUSMINUS_POSTFIX
        P52 @4-4 L1c6 UnaryExpression -> . PostfixExpression
        P53 @4-4 L1c6 UnaryExpression -> . DELETE UnaryExpression
        P54 @4-4 L1c6 UnaryExpression -> . VOID UnaryExpression
        P55 @4-4 L1c6 UnaryExpression -> . TYPEOF UnaryExpression
        P56 @4-4 L1c6 UnaryExpression -> . PLUSPLUS UnaryExpression
        P57 @4-4 L1c6 UnaryExpression -> . MINUSMINUS UnaryExpression
        P58 @4-4 L1c6 UnaryExpression -> . PLUS UnaryExpression
        P59 @4-4 L1c6 UnaryExpression -> . MINUS UnaryExpression
        P60 @4-4 L1c6 UnaryExpression -> . INVERT UnaryExpression
        P61 @4-4 L1c6 UnaryExpression -> . NOT UnaryExpression
        P62 @4-4 L1c6 MultiplicativeExpression -> . UnaryExpression
        P63 @4-4 L1c6 MultiplicativeExpression -> . MultiplicativeExpression MUL UnaryExpression
        P64 @4-4 L1c6 MultiplicativeExpression -> . MultiplicativeExpression DIV UnaryExpression
        P65 @4-4 L1c6 MultiplicativeExpression -> . MultiplicativeExpression MODULUS UnaryExpression
        P66 @4-4 L1c6 AdditiveExpression -> . MultiplicativeExpression
        P67 @4-4 L1c6 AdditiveExpression -> . AdditiveExpression PLUS MultiplicativeExpression
        P68 @4-4 L1c6 AdditiveExpression -> . AdditiveExpression MINUS MultiplicativeExpression
        P69 @4-4 L1c6 ShiftExpression -> . AdditiveExpression
        P70 @4-4 L1c6 ShiftExpression -> . ShiftExpression LEFTMOVE AdditiveExpression
        P71 @4-4 L1c6 ShiftExpression -> . ShiftExpression RIGHTMOVE AdditiveExpression
        P72 @4-4 L1c6 ShiftExpression -> . ShiftExpression RIGHTMOVEFILL AdditiveExpression
        P73 @4-4 L1c6 RelationalExpression -> . ShiftExpression
        P74 @4-4 L1c6 RelationalExpression -> . RelationalExpression LT ShiftExpression
        P75 @4-4 L1c6 RelationalExpression -> . RelationalExpression GT ShiftExpression
        P76 @4-4 L1c6 RelationalExpression -> . RelationalExpression LE ShiftExpression
        P77 @4-4 L1c6 RelationalExpression -> . RelationalExpression GE ShiftExpression
        P78 @4-4 L1c6 RelationalExpression -> . RelationalExpression INSTANCEOF ShiftExpression
        P79 @4-4 L1c6 RelationalExpression -> . RelationalExpression IN ShiftExpression
        P86 @4-4 L1c6 EqualityExpression -> . RelationalExpression
        P87 @4-4 L1c6 EqualityExpression -> . EqualityExpression EQ RelationalExpression
        P88 @4-4 L1c6 EqualityExpression -> . EqualityExpression NE RelationalExpression
        P89 @4-4 L1c6 EqualityExpression -> . EqualityExpression STRICTEQ RelationalExpression
        P90 @4-4 L1c6 EqualityExpression -> . EqualityExpression STRICTNE RelationalExpression
        P96 @4-4 L1c6 BitwiseANDExpression -> . EqualityExpression
        P97 @4-4 L1c6 BitwiseANDExpression -> . BitwiseANDExpression BITAND EqualityExpression
        P100 @4-4 L1c6 BitwiseXORExpression -> . BitwiseANDExpression
        P101 @4-4 L1c6 BitwiseXORExpression -> . BitwiseXORExpression BITXOR BitwiseANDExpression
        P104 @4-4 L1c6 BitwiseORExpression -> . BitwiseXORExpression
        P105 @4-4 L1c6 BitwiseORExpression -> . BitwiseORExpression BITOR BitwiseXORExpression
        P108 @4-4 L1c6 LogicalANDExpression -> . BitwiseORExpression
        P109 @4-4 L1c6 LogicalANDExpression -> . LogicalANDExpression AND BitwiseORExpression
        P112 @4-4 L1c6 LogicalORExpression -> . LogicalANDExpression
        P113 @4-4 L1c6 LogicalORExpression -> . LogicalORExpression OR LogicalANDExpression
        P116 @4-4 L1c6 ConditionalExpression -> . LogicalORExpression
        P117 @4-4 L1c6 ConditionalExpression -> . LogicalORExpression QUESTION_MARK AssignmentExpression COLON AssignmentExpression
        P120 @4-4 L1c6 AssignmentExpression -> . ConditionalExpression
        P121 @4-4 L1c6 AssignmentExpression -> . LeftHandSideExpression ASSIGN AssignmentExpression
        P122 @4-4 L1c6 AssignmentExpression -> . LeftHandSideExpression AssignmentOperator AssignmentExpression
        P137 @4-4 L1c6 Expression -> . AssignmentExpression
        P138 @4-4 L1c6 Expression -> . Expression COMMA AssignmentExpression
        P141 @4-4 L1c6 Statement -> . Block
        P142 @4-4 L1c6 Statement -> . VariableStatement
        P143 @4-4 L1c6 Statement -> . EmptyStatement
        P144 @4-4 L1c6 Statement -> . ExpressionStatement
        F144 @0-4 L1c1-6 Statement -> ExpressionStatement .
        P145 @4-4 L1c6 Statement -> . IfStatement
        P146 @4-4 L1c6 Statement -> . IterationStatement
        P147 @4-4 L1c6 Statement -> . ContinueStatement
        P148 @4-4 L1c6 Statement -> . BreakStatement
        P149 @4-4 L1c6 Statement -> . ReturnStatement
        P150 @4-4 L1c6 Statement -> . WithStatement
        P151 @4-4 L1c6 Statement -> . LabelledStatement
        P152 @4-4 L1c6 Statement -> . SwitchStatement
        P153 @4-4 L1c6 Statement -> . ThrowStatement
        P154 @4-4 L1c6 Statement -> . TryStatement
        P155 @4-4 L1c6 Statement -> . DebuggerStatement
        P156 @4-4 L1c6 Block -> . LCURLY_BLOCK StatementListopt RCURLY
        P159 @4-4 L1c6 VariableStatement -> . VAR VariableDeclarationList SEMICOLON
        P172 @4-4 L1c6 EmptyStatement -> . VISIBLE_SEMICOLON
        P173 @4-4 L1c6 ExpressionStatement -> . Expression SEMICOLON
        F173 @0-4 L1c1-6 ExpressionStatement -> Expression SEMICOLON .
        P174 @4-4 L1c6 IfStatement -> . IF LPAREN Expression RPAREN Statement ELSE Statement
        P175 @4-4 L1c6 IfStatement -> . IF LPAREN Expression RPAREN Statement
        P180 @4-4 L1c6 IterationStatement -> . DO Statement WHILE LPAREN Expression RPAREN SEMICOLON
        P181 @4-4 L1c6 IterationStatement -> . WHILE LPAREN Expression RPAREN Statement
        P182 @4-4 L1c6 IterationStatement -> . FOR LPAREN ExpressionNoInopt VISIBLE_SEMICOLON Expressionopt VISIBLE_SEMICOLON Expressionopt RPAREN Statement
        P183 @4-4 L1c6 IterationStatement -> . FOR LPAREN VAR VariableDeclarationListNoIn VISIBLE_SEMICOLON Expressionopt VISIBLE_SEMICOLON Expressionopt RPAREN Statement
        P184 @4-4 L1c6 IterationStatement -> . FOR LPAREN LeftHandSideExpression IN Expression RPAREN Statement
        P185 @4-4 L1c6 IterationStatement -> . FOR LPAREN VAR VariableDeclarationNoIn IN Expression RPAREN Statement
        P186 @4-4 L1c6 ContinueStatement -> . CONTINUE SEMICOLON
        P187 @4-4 L1c6 ContinueStatement -> . CONTINUE INVISIBLE_SEMICOLON
        P188 @4-4 L1c6 ContinueStatement -> . CONTINUE IDENTIFIER SEMICOLON
        P189 @4-4 L1c6 BreakStatement -> . BREAK SEMICOLON
        P190 @4-4 L1c6 BreakStatement -> . BREAK INVISIBLE_SEMICOLON
        P191 @4-4 L1c6 BreakStatement -> . BREAK IDENTIFIER SEMICOLON
        P192 @4-4 L1c6 ReturnStatement -> . RETURN SEMICOLON
        P193 @4-4 L1c6 ReturnStatement -> . RETURN INVISIBLE_SEMICOLON
        P194 @4-4 L1c6 ReturnStatement -> . RETURN Expression SEMICOLON
        P195 @4-4 L1c6 WithStatement -> . WITH LPAREN Expression RPAREN Statement
        P196 @4-4 L1c6 SwitchStatement -> . SWITCH LPAREN Expression RPAREN CaseBlock
        P207 @4-4 L1c6 LabelledStatement -> . IDENTIFIER COLON Statement
        P208 @4-4 L1c6 ThrowStatement -> . THROW Expression SEMICOLON
        P209 @4-4 L1c6 TryStatement -> . TRY Block Catch
        P210 @4-4 L1c6 TryStatement -> . TRY Block Finally
        P211 @4-4 L1c6 TryStatement -> . TRY Block Catch Finally
        P214 @4-4 L1c6 DebuggerStatement -> . DEBUGGER SEMICOLON
        P215 @4-4 L1c6 FunctionDeclaration -> . FUNCTION IDENTIFIER LPAREN FormalParameterListopt RPAREN LCURLY FunctionBody RCURLY
        P218 @4-4 L1c6 FunctionExpression -> . FUNCTION Identifieropt LPAREN FormalParameterListopt RPAREN LCURLY FunctionBody RCURLY
        F223 @0-4 L1c1-6 SourceElementsopt -> SourceElements .
        F226 @0-4 L1c1-6 Program -> SourceElementsopt .
        F227 @0-4 L1c1-6 SourceElements -> SourceElement .
        R228:1 @0-4 L1c1-6 SourceElements -> SourceElements . SourceElement
        P229 @4-4 L1c6 SourceElement -> . Statement
        F229 @0-4 L1c1-6 SourceElement -> Statement .
        P230 @4-4 L1c6 SourceElement -> . FunctionDeclaration
        P231 @4-4 L1c6 NullLiteral -> . NULL
        P232 @4-4 L1c6 BooleanLiteral -> . TRUE
        P233 @4-4 L1c6 BooleanLiteral -> . FALSE
        P234 @4-4 L1c6 StringLiteral -> . STRINGLITERAL
        P235 @4-4 L1c6 RegularExpressionLiteral -> . REGULAREXPRESSIONLITERAL
        P236 @4-4 L1c6 NumericLiteral -> . DecimalLiteral
        P237 @4-4 L1c6 NumericLiteral -> . HexIntegerLiteral
        P238 @4-4 L1c6 NumericLiteral -> . OctalIntegerLiteral
        P239 @4-4 L1c6 DecimalLiteral -> . DECIMALLITERAL
        P240 @4-4 L1c6 HexIntegerLiteral -> . HEXINTEGERLITERAL
        P241 @4-4 L1c6 OctalIntegerLiteral -> . OCTALINTEGERLITERAL
        F242 @0-4 L1c1-6 :start -> Program .
        Exception::Class::Base::throw("MarpaX::Languages::ECMAScript::AST::Exception::SyntaxError", "error", "Error in SLIF parse: Unrecognized problem code: unpermitted m"...) called at /usr/share/perl5/Exception/Class.pm line 167
        Exception::Class::__ANON__("error", "Error in SLIF parse: Unrecognized problem code: unpermitted m"...) called at lib/MarpaX/Languages/ECMAScript/AST/Grammar/Base.pm line 271
        MarpaX::Languages::ECMAScript::AST::Grammar::Base::_callback(MarpaX::Languages::ECMAScript::AST::Grammar::ECMAScript_262_5::Program=HASH(0xb27c518), "a = b\x{a}++c ", 8, 9, MarpaX::Languages::ECMAScript::AST::Impl=HASH(0xb27c48c), CODE(0xa6cd868), "", MarpaX::Languages::ECMAScript::AST::Grammar::ECMAScript_262_5::Program=HASH(0xb27c518)) called at lib/MarpaX/Languages/ECMAScript/AST/Grammar/Base.pm line 353
        MarpaX::Languages::ECMAScript::AST::Grammar::Base::parse(MarpaX::Languages::ECMAScript::AST::Grammar::ECMAScript_262_5::Program=HASH(0xb27c518), "a = b\x{a}++c ", MarpaX::Languages::ECMAScript::AST::Impl=HASH(0xb27c48c), HASH(0xbc0bc84)) called at lib/MarpaX/Languages/ECMAScript/AST/Grammar/ECMAScript_262_5/Program.pm line 178
        MarpaX::Languages::ECMAScript::AST::Grammar::ECMAScript_262_5::Program::parse(MarpaX::Languages::ECMAScript::AST::Grammar::ECMAScript_262_5::Program=HASH(0xb27c518), "a = b\x{a}++c", MarpaX::Languages::ECMAScript::AST::Impl=HASH(0xb27c48c)) called at lib/MarpaX/Languages/ECMAScript/AST.pm line 304
        MarpaX::Languages::ECMAScript::AST::__ANON__() called at lib/MarpaX/Languages/ECMAScript/AST.pm line 322
        MarpaX::Languages::ECMAScript::AST::parse(MarpaX::Languages::ECMAScript::AST=HASH(0x978aa3c), "a = b\x{a}++c") called at t/bug.pl line 21

jddurand commented 10 years ago

Note: reproduced with 2.097003

jeffreykegler commented 10 years ago

Tokens at a specific location should be read either externally, with lexeme_alternative() or lexeme_read(), or internally, but not both. There is a point at which you call lexeme_alternative() but do not call lexeme_complete() before calling $slif_recce->resume(), so you wind up with both internally and externally read lexemes at the same location.

Previously, nothing detected this, but I added an error message for it when adding the new feature set.
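
The ordering rule described here can be modelled with a toy object. This is only a sketch of the rule as stated in this comment, not Marpa's implementation: `ToyRecce` is a hypothetical class that borrows the method names `lexeme_alternative()`, `lexeme_complete()` and `resume()`, ignores their arguments, and only tracks whether an externally offered token is still waiting to be completed.

```perl
use strict;
use warnings;

# Toy model (not Marpa code): tracks whether an externally supplied
# alternative is still waiting for lexeme_complete().
package ToyRecce {
    sub new { my ($class) = @_; return bless { pending => 0 }, $class }

    # External scanning, step 1: offer a token at the current location.
    sub lexeme_alternative { my ($self) = @_; $self->{pending}++; return 1 }

    # External scanning, step 2: close the location, consuming the alternatives.
    sub lexeme_complete { my ($self) = @_; $self->{pending} = 0; return 1 }

    # Internal scanning may only restart once no alternative is pending.
    sub resume {
        my ($self) = @_;
        die "unpermitted mix of external and internal scanning\n"
            if $self->{pending};
        return 1;
    }
}

package main;

# Correct order: alternative, complete, then resume.
my $good = ToyRecce->new;
$good->lexeme_alternative( 'SEMICOLON', ';' );
$good->lexeme_complete( 5, 1 );
my $good_ok = eval { $good->resume };

# Faulty order: resume while an alternative is still pending.
my $bad = ToyRecce->new;
$bad->lexeme_alternative( 'SEMICOLON', ';' );
my $bad_ok  = eval { $bad->resume };
my $bad_err = $@;

print "complete then resume: ", ( $good_ok ? "ok" : "failed" ), "\n";
print "resume without complete: $bad_err";
```

The symbol name and offsets passed in the usage lines are illustrative values only.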

jddurand commented 10 years ago

Since 100% of the calls to Marpa go through the Impl.pm file, let me put a trace statement in each of its methods to check.

jddurand commented 10 years ago

There is no call to lexeme_alternative() anywhere in MarpaX::Languages::ECMAScript::AST, only lexeme_read(). Here is the full sequence of Marpa calls:

2014/10/25 22:58:32 DEBUG  20840 Marpa::R2::Scanless::G->new()
2014/10/25 22:58:32 DEBUG  20840 Marpa::R2::Scanless::R->new()
2014/10/25 22:58:32 TRACE  20840 Setting trace_terminals option

2014/10/25 22:58:32 TRACE  20840 Setting trace_values option

2014/10/25 22:58:32 DEBUG  20840 recce->read
2014/10/25 22:58:32 TRACE  20840 Lexer "L0" accepted lexeme L1c1 e1: IDENTIFIER; value="a"

2014/10/25 22:58:32 DEBUG  20840 recce->events
2014/10/25 22:58:32 DEBUG  20840 recce->current_g1_location
2014/10/25 22:58:32 DEBUG  20840 recce->g1_location_to_span
2014/10/25 22:58:32 DEBUG  20840 recce->literal
2014/10/25 22:58:32 DEBUG  20840 recce->resume
2014/10/25 22:58:32 TRACE  20840 Lexer "L0" discarded lexeme L1c2: _S_MANY

2014/10/25 22:58:32 TRACE  20840 Lexer "L0" accepted lexeme L1c3 e2: ASSIGN; value="="

2014/10/25 22:58:32 TRACE  20840 Lexer "L0" discarded lexeme L1c4: _S_MANY

2014/10/25 22:58:32 TRACE  20840 Lexer "L0" accepted lexeme L1c5 e3: IDENTIFIER; value="b"

2014/10/25 22:58:32 DEBUG  20840 recce->events
2014/10/25 22:58:32 DEBUG  20840 recce->current_g1_location
2014/10/25 22:58:32 DEBUG  20840 recce->g1_location_to_span
2014/10/25 22:58:32 DEBUG  20840 recce->literal
2014/10/25 22:58:32 DEBUG  20840 recce->resume
2014/10/25 22:58:32 TRACE  20840 Lexer "L0" discarded lexeme L1c6: _S_MANY

2014/10/25 22:58:32 DEBUG  20840 recce->events
2014/10/25 22:58:32 DEBUG  20840 recce->current_g1_location
2014/10/25 22:58:32 DEBUG  20840 recce->g1_location_to_span
2014/10/25 22:58:32 DEBUG  20840 recce->literal
2014/10/25 22:58:32 DEBUG  20840 recce->lexeme_read
2014/10/25 22:58:32 DEBUG  20840 recce->lexeme_read
2014/10/25 22:58:32 DEBUG  20840 recce->resume
2014/10/25 22:58:32 DEBUG  20840 recce->current_g1_location
2014/10/25 22:58:32 DEBUG  20840 recce->g1_location_to_span
2014/10/25 22:58:32 DEBUG  20840 recce->literal
2014/10/25 22:58:32 DEBUG  20840 recce->current_g1_location
2014/10/25 22:58:32 DEBUG  20840 recce->g1_location_to_span
2014/10/25 22:58:32 DEBUG  20840 recce->line_column
2014/10/25 22:58:32 DEBUG  20840 recce->show_progress
2014/10/25 22:58:32 DEBUG  20840 destroy Marpa::R2::Scanless::R
Uncaught exception from user code:
        Error in SLIF parse: Unrecognized problem code: unpermitted mix of external and internal scanning
        * String before error: a = b\n++
        * The error was at line 2, column 3, and at character 0x0063 'c', ...
        * here: c\s
        Marpa::R2 exception at lib/MarpaX/Languages/ECMAScript/AST/Impl.pm line 141.

jeffreykegler commented 10 years ago

Looking back, all the calls to lexeme_alternative() do take place indirectly via lexeme_read(). What's probably happening is that lexeme_read() is returning undef (for a rejected token) and you're not checking for the failure to accept any lexemes.
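
A minimal sketch of such a check, relying only on the behaviour stated here: lexeme_read() returns undef when the token is rejected. The helper `checked_lexeme_read` and the class `StubRecce` are hypothetical names. The stub stands in for a Marpa::R2::Scanless::R object so that the snippet runs without Marpa installed, and its "new position" return value is a simplification.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a recognizer: it accepts only the symbol
# names given at construction and returns undef for anything else,
# mimicking a rejected token.
package StubRecce {
    sub new {
        my ( $class, @expected ) = @_;
        return bless { expected => { map { $_ => 1 } @expected } }, $class;
    }

    sub lexeme_read {
        my ( $self, $symbol, $start, $length, $value ) = @_;
        return undef unless $self->{expected}{$symbol};
        return $start + $length;    # simplified "new position"
    }
}

package main;

# Hypothetical wrapper: never let a rejected lexeme pass silently.
sub checked_lexeme_read {
    my ( $recce, $symbol, $start, $length, $value ) = @_;
    my $pos = $recce->lexeme_read( $symbol, $start, $length, $value );
    die "Lexeme $symbol rejected at offset $start\n" if !defined $pos;
    return $pos;
}

# In "a = b\n++c" the '++' starts at offset 6. After the inserted
# semicolon only the prefix form is expected in this stub.
my $recce = StubRecce->new('PLUSPLUS');
my $ok    = checked_lexeme_read( $recce, 'PLUSPLUS', 6, 2, '++' );
my $err   = eval { checked_lexeme_read( $recce, 'PLUSPLUS_POSTFIX', 6, 2, '++' ); 1 }
    ? ''
    : $@;

print "accepted, next offset $ok\n";
print "caught: $err";
```

With a wrapper of this kind, a rejected token fails at the lexeme_read() call itself rather than surfacing later at resume().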

jddurand commented 10 years ago

Possibly, but then why did it work previously? I am checking the code.

jddurand commented 10 years ago

Yes, I am reading a postfix '++' instead of a normal (prefix) '++'. Checking all the lexeme_read() calls has shown that it is a bug in my package. Many thanks for your time on this. Do you know if it is possible to move an issue from one repo to another? ;-)

jeffreykegler commented 10 years ago

It's possible to refer to an issue in another repo. Some of mine do that. What I'd do is open a new issue with a reference to this one.

I take it I can close this.

jddurand commented 10 years ago

Reference done. Thanks.