Closed dwhitena closed 1 year ago
Thanks @dwhitena! That makes sense. I actually had this logic initially (see https://github.com/r2d4/parserllm/blob/b6cadb25a02acc923382a584b294a4c6ede6324b/parserllm/parserllm.py#L17-L35) before we changed to the interactive parser. In that case, I just caught the exception and continued parsing in parserllm.
I think I like your solution more.
(A related problem I see for both methods is token counting -- counting (1) LLM tokens and (2) parser tokens. Would be an interesting feature to add to limit tokens in both ways)
I'll release a new version, bump it in parserllm, and do a release there (I'm still meaning to merge both repos into a monorepo).
First off, thanks so much for your work on this package! So useful.
When trying to use `rellm` with `parserllm` (https://github.com/r2d4/parserllm), I kept getting `UnexpectedCharacter` exceptions from Lark. I tracked these down to the stop after match behavior in `rellm`. When using the example JSON CFG, I would have something generated like:

Note, the step 2 completion added in `":` and thus wasn't a full match, but I would actually want a stop after match here, because `"positive"` would represent the next token in the JSON CFG parsing. This behavior eventually causes the `next_lex` method to raise an exception, because the step 5 completion was a full match (despite being an incomplete match to a JSON pattern).

To fix this, I changed the `fullmatch` method in `rellm` to `match` along with a check to make sure the match starts at index 0. There might be a better way to do this, but it worked for me!
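
To illustrate the difference, here is a minimal sketch using Python's standard `re` module. It is not the actual `rellm` code: the pattern `STRING_TOKEN` and the helpers `stop_after_fullmatch` and `stop_after_prefix_match` are hypothetical names made up for this example, and the completion string is just the `"positive":` case described above.

```python
import re

# Hypothetical pattern for a JSON string token (assumption, not rellm's pattern).
STRING_TOKEN = re.compile(r'"[^"]*"')


def stop_after_fullmatch(pattern, text):
    """Accept the text only if the whole string matches the pattern."""
    return text if pattern.fullmatch(text) else None


def stop_after_prefix_match(pattern, text):
    """Accept the text if a match starts at index 0, and cut off anything after it."""
    m = pattern.match(text)
    # re.match already anchors at the start of the string, so the start check
    # is kept only to mirror the fix described above.
    if m is not None and m.start() == 0:
        return text[: m.end()]
    return None


# The completion overshot the token by one character (the trailing colon).
completion = '"positive":'

full = stop_after_fullmatch(STRING_TOKEN, completion)
prefix = stop_after_prefix_match(STRING_TOKEN, completion)
```

With `fullmatch`, the overshooting completion is rejected and generation keeps going, while the prefix match stops at the end of the token and drops the extra `:`, so the parser only sees the token it expected.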