Hello everyone! I found that Korean Llama 2 models like beomi/llama-2-ko-7b are giving junk output like \n[/INST]\n\n[/INST].... I tried multiple Korean Llama 2 models and I am getting similar junk results. What may be the reason? Is it because I am running on a V100 GPU? Other models like NousResearch/llama-2-7b-chat-hf are working fine.
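For context, [/INST] is the turn delimiter from the Llama 2 chat prompt template, so it looks like the template markers are being echoed back instead of an answer. A minimal direct-generation check with transformers (model id from above; the short Korean prompt is just a placeholder) could rule OpenLLM itself in or out:

```python
# Sketch: generate directly with transformers to see whether the junk
# [/INST] output comes from the model/prompt or from OpenLLM.
# Assumes a CUDA GPU; fp16 is used since the V100 has no bfloat16 support.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "beomi/llama-2-ko-7b"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "[INST] 안녕하세요, 자기소개를 해주세요. [/INST]"  # placeholder prompt
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tok.decode(out[0], skip_special_tokens=True))
```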
The difference between these Llama models and the others is that they use the Hugging Face fast tokenizer instead of the SentencePiece tokenizer.model used in the regular Llama models. Doesn't OpenLLM support Llama models without a tokenizer.model file?
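As a quick sanity check on the tokenizer side, something like this (a sketch; the sample string is arbitrary) would show which tokenizer class actually loads and whether encoding round-trips cleanly without a tokenizer.model file:

```python
# Sketch: confirm the repo loads a fast tokenizer (no tokenizer.model needed)
# and that a prompt survives the encode/decode round trip.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("beomi/llama-2-ko-7b")
print(type(tok).__name__, tok.is_fast)  # expect a *Fast class, is_fast=True

sample = "[INST] 안녕하세요 [/INST]"  # arbitrary sample string
ids = tok(sample).input_ids
print(ids)
print(tok.decode(ids, skip_special_tokens=False))  # compare with the input
```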
I also tested GPT-NeoX models like beomi/polyglot-ko-12.8b, and they work fine. So I am wondering what the issue may be. Thank you!