Following up on #21: while inspecting my own work (copy-edit / git-compare / diff via Beyond Compare), I noticed that the parser kernel has a few very subtle bugs in the error recovery parse loop. The parse loop is duplicated in the error recovery section (so that I can optimize the main parse loop for regular operation and not worry there about the special handling that error recovery requires), and that duplicate does not hand over / drop out into the outer parse loop correctly:
- when `parseError()` produces a 'parser return value', that value is DESTROYED in the ACCEPT phase of the outer loop.
- the outer loop can be further optimized when it doesn't have to worry about a still-active 'recovery phase', i.e. `recovering === 0` should be a precondition in the outer parse loop.
- when (edge case) the lexer is also in the habit of producing TERROR tokens (some of my grammars do this), we will lose their `yyval` and `yylloc`! Hence we must differentiate between a TERROR set up as a replacement token in the parser kernel's error recovery section and a TERROR token produced by the lexer: the latter is an error token too, but should only indirectly trigger error recovery by the parser.
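To make the first two points concrete, here is a rough sketch of the handover I have in mind. This is not the actual kernel code: `runRecoveryLoop`, the `state` object and all of its fields (`errorPending`, `parseErrorResult`, `actions`, `acceptValue`) are hypothetical stand-ins, and only `recovering` and the ACCEPT phase come from the description above.

```javascript
// Hypothetical sketch, NOT the real parser kernel. All names except
// `recovering` are made up for illustration.
const ACCEPT = 'accept';

// Stand-in for the duplicated inner (error recovery) parse loop.
// It reports explicitly how it ended, so the outer loop never has to guess.
function runRecoveryLoop(state) {
    // Pretend we shifted enough tokens to leave recovery mode.
    while (state.recovering > 0) {
        state.recovering--;
    }
    // If parseError() produced a parser return value, hand it back as final.
    if (state.parseErrorResult !== undefined) {
        return { done: true, retval: state.parseErrorResult };
    }
    return { done: false };
}

function parse(state) {
    let retval;
    for (;;) {
        if (state.errorPending) {
            state.errorPending = false;
            state.recovering = 3;
            const r = runRecoveryLoop(state);
            if (r.done) {
                // Return right away: falling through to the ACCEPT code
                // below would overwrite the value produced by parseError().
                return r.retval;
            }
        }
        // Precondition of the outer loop: no recovery phase is still active.
        if (state.recovering !== 0) {
            throw new Error('outer parse loop entered while still recovering');
        }
        const action = state.actions.shift();
        if (action === undefined) {
            throw new Error('ran out of actions before ACCEPT');
        }
        if (action === ACCEPT) {
            retval = state.acceptValue;
            break;
        }
        // Any other action (shift/reduce) is a no-op in this sketch.
    }
    return retval;
}
```

The point of the `{ done, retval }` result is that the inner loop either finishes the parse itself or drops back into the outer loop with `recovering === 0`, so the outer loop needs no recovery-specific branches.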
Hint To Self: this means that an `error` term in a grammar production has an associated value which is either a parser error recovery info object or a lexer-produced `yyvalue`, depending on whether the lexer TERROR-or-other token triggered parser error recovery or not! ... Talk about complex internals... 🤡
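A minimal sketch of that either/or value, to pin the idea down. Everything here is an assumption for illustration: the token id `TERROR = 2`, the token shape `{ id, yyval, yylloc }`, the `canShiftError` flag and the layout of the recovery info object are not taken from the kernel.

```javascript
// Hypothetical sketch: token id, token shape and info-object layout are
// assumptions, not the real kernel data structures.
const TERROR = 2;

// Decide which value ends up attached to the `error` term.
// `canShiftError` stands for "the grammar accepts `error` in the current state".
function valueForErrorTerm(token, canShiftError) {
    if (token.id === TERROR && canShiftError) {
        // Lexer-produced TERROR consumed like any other token:
        // no parser recovery is triggered, so its yyval/yylloc pass through.
        return { recovered: false, value: token.yyval, loc: token.yylloc };
    }
    // Parser error recovery kicks in: the kernel sets up a replacement
    // TERROR whose value is a recovery info object. Keeping the offending
    // token inside it means a lexer TERROR's yyval/yylloc are not lost.
    const info = { offendingToken: token, lexerProduced: token.id === TERROR };
    return { recovered: true, value: info, loc: token.yylloc };
}

// Usage: a lexer TERROR that the grammar cannot shift directly.
const lexed = { id: TERROR, yyval: 'bad#char', yylloc: { first_line: 3 } };
const viaRecovery = valueForErrorTerm(lexed, false);
const direct = valueForErrorTerm(lexed, true);
```

In both cases the lexer's `yyval` and `yylloc` stay reachable: directly in `direct.value`, and through `viaRecovery.value.offendingToken` when recovery was triggered.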