Open knilink opened 3 months ago
Hi @knilink , thanks for reporting this! Do you know if this happens if you try to generate with llama-cpp-python directly? Getting the full stack trace here would be very helpful!
@paulbkoch might have thoughts here too
Hi @Harsha-Nori, I did a bit more investigation and can confirm the error was caused by sending incomplete Unicode bytes to the llama_cpp tokenizer:
```
$ printf '\xe6\xad' | ./llama-tokenize -m ./Meta-Llama-3-8B-Instruct.Q8_0.gguf --stdin
terminate called after throwing an instance of 'std::invalid_argument'
  what():  invalid character
Aborted
```
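The invalid input in the repro is easy to reconstruct in plain Python: `b'\xe6\xad'` is the first two bytes of the three-byte UTF-8 encoding of 歪, i.e. a multi-byte character truncated mid-sequence.

```python
# "歪" (U+6B6A) encodes to three bytes in UTF-8; truncating after two
# bytes yields exactly the invalid sequence piped to llama-tokenize above.
full = "歪".encode("utf-8")
assert full == b"\xe6\xad\xaa"

truncated = full[:2]
assert truncated == b"\xe6\xad"

# Decoding the truncated sequence fails in Python too, which is the
# same invalid-UTF-8 condition the C++ tokenizer aborts on.
try:
    truncated.decode("utf-8")
except UnicodeDecodeError as e:
    print(e.reason)  # "unexpected end of data"
```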
After adding `byte_string.decode('utf8')` before https://github.com/guidance-ai/guidance/blob/337738322f7d09f36613a4c40f86137c3a0a1553/guidance/models/llama_cpp/_llama_cpp.py#L78, I got the following stack trace:
The Transformers model didn't have this issue because its `_joint_tokenize` doesn't use the tokenizer directly. I haven't done much testing, but copying `TransformersEngine._joint_tokenize` over to `LlamaCppEngine` seems to fix the issue.
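The idea behind that workaround can be sketched as follows: decode the whole byte string to text up front and tokenize the text in one pass, rather than handing raw (possibly truncated) UTF-8 bytes to the tokenizer. This is a minimal sketch, not guidance's actual `_joint_tokenize` implementation; `tokenize_text` is a hypothetical stand-in for the real tokenizer call.

```python
# Hedged sketch of the workaround: decoding up front surfaces invalid
# UTF-8 as a Python UnicodeDecodeError instead of letting the C++
# tokenizer abort the process.
def joint_tokenize(byte_string: bytes, tokenize_text) -> list:
    text = byte_string.decode("utf-8")  # raises on incomplete sequences
    return tokenize_text(text)

# Toy tokenizer for illustration only: one "token id" per code point.
toy_tokenizer = lambda s: [ord(c) for c in s]
print(joint_tokenize("歪".encode("utf-8"), toy_tokenizer))  # [27498]
```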
@knilink, thank you for bringing this up. I've drafted a (very) tentative fix in #962, which works by chopping bytes off the input to the `encode()` method until it forms a valid UTF-8 string. However, I'm concerned that this is going to cause trouble for us elsewhere.
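A minimal sketch of that "chop bytes until valid UTF-8" idea; `trim_incomplete_utf8` is a hypothetical helper for illustration, not the actual code in #962.

```python
def trim_incomplete_utf8(byte_string: bytes) -> bytes:
    """Drop trailing bytes that form an incomplete UTF-8 sequence."""
    # A UTF-8 character is at most 4 bytes, so at most 3 trailing bytes
    # can belong to a truncated character.
    for cut in range(4):
        candidate = byte_string[: len(byte_string) - cut]
        try:
            candidate.decode("utf-8")
            return candidate
        except UnicodeDecodeError:
            continue
    return byte_string  # invalid beyond simple truncation; leave as-is

assert trim_incomplete_utf8(b"\xe6\xad") == b""          # truncated 歪
assert trim_incomplete_utf8(b"a\xe6\xad") == b"a"        # keep valid prefix
assert trim_incomplete_utf8("歪abc".encode()) == "歪abc".encode()
```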
Have you filed your repro (`printf '\xe6\xad' | ./llama-tokenize -m ./Meta-Llama-3-8B-Instruct.Q8_0.gguf --stdin`) as a bug with llama.cpp?
I have been doing some more prodding based on @knilink 's examples, and I've opened a bug on the HF repo whence I grabbed the model (although this does look like something going wrong at the LlamaCpp layer): https://huggingface.co/bartowski/Meta-Llama-3-8B-Instruct-GGUF/discussions/9
Also filed the bug against llama.cpp: https://github.com/ggerganov/llama.cpp/issues/8691
The bug: a string containing certain Unicode characters causes an exception, likely because 歪 is a multi-token character for this tokenizer. I also tested the transformers model, which seems to work fine.
To Reproduce
System info (please complete the following information):
- OS: Ubuntu 22.04
- Python 3.10.12
- guidance==0.1.15
- llama_cpp_python==0.2.79