Open zevisert opened 3 years ago
Note that this is something that might break compatibility. This is something we have in mind, and I think we also agree that it would be better for both parsers to throw the same exception. (Note that this includes the possibility of making the earley parser throw `UnexpectedToken`. But you are making a decent case for keeping `UnexpectedEOF`.)
While this is certainly a good change, it might only happen in 1.0. (Or we temporarily make `UnexpectedEOF` behave like an `UnexpectedToken`. But that seems a bit hacky.)
Yeah, this is definitely a breaking change either way, as the different exception types can change the control flow of a program. You've seen my use case, so I would prefer both parsers to raise `UnexpectedEOF`. That said, there are easy workarounds here until 1.0 lands.
Thanks for the great library!
Yes, consistency would make error-catching much easier.
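Until both parsers agree on one exception, the kind of workaround mentioned in this thread could be sketched roughly as below. This is a minimal illustration, not part of lark: `is_premature_end` and `ChunkAssembler` are hypothetical names, and the check is duck-typed on the exception's class name only so that the sketch runs without lark installed. Real code would more likely use `isinstance()` against the classes in `lark.exceptions`.

```python
from typing import Any, Callable, Optional


def is_premature_end(err: Exception) -> bool:
    """Return True if err signals that the parser ran out of input.

    Assumption: matching on the class name stands in for an isinstance()
    check against lark's own exception classes.
    """
    name = type(err).__name__
    if name == "UnexpectedEOF":  # reported for the earley parser
        return True
    # reported for the lalr parser: UnexpectedToken whose token type is "$END"
    token = getattr(err, "token", None)
    return name == "UnexpectedToken" and getattr(token, "type", None) == "$END"


class ChunkAssembler:
    """Hypothetical helper: buffer chunks until they parse as one record."""

    def __init__(self, parse: Callable[[str], Any]) -> None:
        self._parse = parse  # e.g. the bound parse method of a parser object
        self._buffer = ""

    def feed(self, chunk: str) -> Optional[Any]:
        """Add a chunk; return the parsed record, or None if more data is needed."""
        self._buffer += chunk
        try:
            record = self._parse(self._buffer)
        except Exception as err:
            if is_premature_end(err):
                return None  # keep the buffer and wait for the next chunk
            raise  # any other parse error is a real failure
        self._buffer = ""  # record complete, start a fresh buffer
        return record
```

With lark, the `parse` callable would presumably be the `parse` method of a `Lark` instance, so the same calling code works whichever of the two exceptions the chosen parser raises.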
Describe the bug
When an input is exhausted, the earley parser raises `lark.exceptions.UnexpectedEOF(...)`, while the lalr parser raises `lark.exceptions.UnexpectedToken('$END', ...)`.
For consistency's sake, in lalr parsers, if the token that triggered an `UnexpectedToken` error is `'$END'`, the error should be re-raised as `UnexpectedEOF`.
Some extra context
I am building an application that requires parsing a stream, and I had switched to the (_much faster_) lalr parser. Because my stream may require assembling several 'chunks' to create a valid record, I was catching `UnexpectedEOF` from earley, but now I have to catch `UnexpectedToken` and drill into the error to check the token:

```py
try:
    parser.parse(buffer)  # `parser` and `buffer` are placeholder names
except lark.exceptions.UnexpectedToken as err:
    if err.token.type == "$END":
        logger.debug("Parser expected more data, waiting for another chunk")
    else:
        raise  # re-raise with the original traceback
```

To Reproduce
Output