Open DreamGenX opened 5 months ago
You raise a very good point. I shall add the ability to correctly handle a stop sequence as a special token when it tokenizes to a single special token. That way, you will be able to add something like <|eot_id|> to stop_sequences and it will just work.
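A minimal sketch of that detection logic, assuming a toy tokenizer and made-up helper names (this is illustrative, not the actual koboldcpp implementation):

```python
# Toy vocabulary: maps special-token strings to illustrative ids.
SPECIAL_TOKENS = {"<|eot_id|>": 128009}

def tokenize(text):
    # Stand-in tokenizer: an exact special-token string becomes one id;
    # anything else becomes one id per character (toy behavior only).
    if text in SPECIAL_TOKENS:
        return [SPECIAL_TOKENS[text]]
    return [ord(c) for c in text]

def stop_token_id(stop_sequence):
    # If the stop sequence tokenizes to exactly one special-token id,
    # return that id so generation can stop on the token id directly
    # instead of comparing detokenized strings.
    ids = tokenize(stop_sequence)
    if len(ids) == 1 and ids[0] in SPECIAL_TOKENS.values():
        return ids[0]
    return None

print(stop_token_id("<|eot_id|>"))  # → 128009 (handled as a special token)
print(stop_token_id("\nUser:"))     # → None (multi-token, handled as a string)
```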
Should be fixed in latest release
Thank you very much @LostRuins -- is there any reason not to also render the special tokens? That way you could, e.g., use a stop sequence like <|im_start|>user, and that indeed works with backends that rely on HF tokenizers, like vLLM or Aphrodite -- and IIRC, it also works for koboldcpp and llama.cpp with some other models (e.g. my older Mistral 7B ChatML model).
Well, special tokens don't always have a string representation. Most of the time they're unwanted in output, and piping them there would require clients to manually parse and strip them before displaying the content to the user. Also, the current behavior of upstream llama.cpp is to map all special tokens to the empty string when detokenizing. To be honest, I do wish they had used regular tokens instead.
@LostRuins Most backends, like vLLM, have this as an option (to render or not render special tokens). Rendering special tokens allows you to properly parse the response, which is useful when the output is semi-structured, à la ChatML, and is one of the reasons for having special tokens in the first place.
Regarding your side-note, special tokens have lots of nice advantages compared to regular tokens, namely that they are always tokenized as one unit, have one purpose, and are not present in the input.
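The atomicity point above can be illustrated with a sketch (toy tokenizer and made-up ids, purely for illustration): with special-token parsing enabled, the marker encodes to a single id, while the same characters typed as plain input split into many regular ids, so the marker cannot be forged from user text.

```python
SPECIALS = {"<|im_end|>": 128009}  # illustrative special-token id

def encode(text, parse_special):
    # Toy encoder: match special-token strings as one unit when
    # parse_special is on, else fall back to one id per character.
    ids, i = [], 0
    while i < len(text):
        matched = False
        if parse_special:
            for tok, tid in SPECIALS.items():
                if text.startswith(tok, i):
                    ids.append(tid)
                    i += len(tok)
                    matched = True
                    break
        if not matched:
            ids.append(ord(text[i]))  # toy: one id per character
            i += 1
    return ids

print(encode("<|im_end|>", parse_special=True))        # → [128009], one unit
print(len(encode("<|im_end|>", parse_special=False)))  # → 10, split into pieces
```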
Hi, this should be fixed in the latest version: you can now pass render_special to the API to force special tokens to be output.
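A hedged sketch of what a request with the new flag might look like, using the usual KoboldAI-compatible generate endpoint; the endpoint path and the other field names here are assumptions, so check your version's API docs for the exact names:

```python
import json

# Example payload; render_special asks the backend to emit special
# tokens as text instead of mapping them to the empty string.
payload = {
    "prompt": "<|im_start|>user\nHello<|im_end|>\n<|im_start|>assistant\n",
    "max_length": 200,
    "stop_sequence": ["<|im_end|>"],
    "render_special": True,
}

# To actually send it (requires a running koboldcpp instance):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:5001/api/v1/generate",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read())

print(json.dumps(payload, indent=2))
```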
Awesome, thank you!
Hello!
When loading this model as GGUF https://huggingface.co/LoneStriker/opus-v1.2-llama-3-8b-GGUF the special tokens are not rendered for some reason, which breaks the "stop string" functionality.
Specifically, I am setting <|im_end|> as a stop string, but because the logic relies on string comparison and the corresponding token id is rendered as an empty string, it never stops. The <|im_end|> is tokenized correctly as token id 128009; I checked by inspecting the prompt token ids in debug mode. It's just not rendered.
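The failure mode described above can be sketched as follows (toy detokenizer and illustrative ids, not the real koboldcpp code): when the special token detokenizes to the empty string, the stop string never appears in the accumulated output text, so the string comparison never matches.

```python
def detokenize(token_id, render_special=False):
    # Toy detokenizer: special ids render as their string only when
    # render_special is on; otherwise they map to the empty string,
    # mirroring the upstream llama.cpp behavior described above.
    specials = {128009: "<|im_end|>"}
    if token_id in specials:
        return specials[token_id] if render_special else ""
    return chr(token_id)  # toy mapping for regular tokens

def hits_stop(generated_ids, stop_string, render_special):
    # Stop check based on string comparison over the rendered output.
    text = "".join(detokenize(t, render_special) for t in generated_ids)
    return stop_string in text

ids = [ord(c) for c in "Hello"] + [128009]
print(hits_stop(ids, "<|im_end|>", render_special=False))  # → False: renders as ""
print(hits_stop(ids, "<|im_end|>", render_special=True))   # → True: stop matches
```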