Handle this specific UTF-8 error:
`None`: the end of the input was reached unexpectedly. `self.valid_up_to()` is 1 to 3 bytes from the end of the input. If a byte stream (such as a file or a network socket) is being decoded incrementally, this could be a valid `char` whose UTF-8 byte sequence is spanning multiple chunks.
The LLaMA `Executor` cuts its output at arbitrary byte boundaries, which works for ASCII but not for multi-byte UTF-8. This adds a buffer that holds the trailing incomplete bytes until the next chunk arrives, then prepends them to that chunk so the split character is decoded whole. Genuinely invalid UTF-8 (not merely cut off) is replaced with `std::char::REPLACEMENT_CHARACTER` instead of panicking the thread.
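For reviewers, here is a minimal sketch of the buffering idea, not the code in this PR. The `Utf8ChunkDecoder` type and its `push` and `finish` methods are hypothetical names chosen for illustration; only `std::str::from_utf8`, `Utf8Error::valid_up_to`, `Utf8Error::error_len`, and `std::char::REPLACEMENT_CHARACTER` are real standard-library items.

```rust
/// Hypothetical helper (illustration only): decodes UTF-8 that arrives
/// in chunks split at arbitrary byte boundaries.
#[derive(Default)]
struct Utf8ChunkDecoder {
    /// Bytes of a character that was cut off at the end of the previous chunk.
    pending: Vec<u8>,
}

impl Utf8ChunkDecoder {
    /// Decodes one chunk and returns all text that is complete so far.
    fn push(&mut self, chunk: &[u8]) -> String {
        // Prepend the bytes held back from the previous chunk.
        let mut bytes = std::mem::take(&mut self.pending);
        bytes.extend_from_slice(chunk);

        let mut out = String::new();
        let mut rest: &[u8] = &bytes;
        loop {
            match std::str::from_utf8(rest) {
                Ok(valid) => {
                    out.push_str(valid);
                    break;
                }
                Err(err) => {
                    let (valid, after) = rest.split_at(err.valid_up_to());
                    // `valid` is well-formed, so the lossy conversion
                    // never substitutes anything here.
                    out.push_str(&String::from_utf8_lossy(valid));
                    match err.error_len() {
                        // Genuinely invalid bytes: emit U+FFFD and skip them.
                        Some(len) => {
                            out.push(std::char::REPLACEMENT_CHARACTER);
                            rest = &after[len..];
                        }
                        // Input ended mid-character: keep the 1 to 3
                        // trailing bytes for the next chunk.
                        None => {
                            self.pending = after.to_vec();
                            break;
                        }
                    }
                }
            }
        }
        out
    }

    /// Call at end of stream: a still-incomplete character can no longer
    /// be completed, so it is reported as U+FFFD.
    fn finish(&mut self) -> String {
        if std::mem::take(&mut self.pending).is_empty() {
            String::new()
        } else {
            std::char::REPLACEMENT_CHARACTER.to_string()
        }
    }
}
```

In this sketch, feeding the two bytes of `é` (`0xC3 0xA9`) in separate `push` calls yields nothing for the split character on the first call and the whole character on the second, while a stray `0xFF` byte comes out as U+FFFD without panicking.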
Fixes https://github.com/sobelio/llm-chain/issues/187