turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License
3.45k stars 257 forks

Yi models generate some languages in an incorrect encoding #155

Closed samuelazran closed 9 months ago

samuelazran commented 10 months ago

For example, I prompt it to extract an entity from a text in Hebrew, and it produces: "���� ���". It seems to handle Hebrew language understanding fine, but the decoding of the tokens when generating Hebrew seems buggy, and only when using exllamav2 (the same prompt with the same model works perfectly with Hugging Face Transformers).

What could be the issue? Could it be their custom tokenizer?

I've tested it with the latest exllamav2 (from the commit that added Yi model support) with several models, including "01-ai/Yi-34B-200K" and "LoneStriker/Yi-34B-200K-6.0bpw-h6-exl2" from the Hugging Face Hub.

Attaching a screenshot here.

Screenshot 2023-11-09 at 14 06 39
turboderp commented 10 months ago

Thank you for noticing. It actually had nothing to do with the tokenizer but rather the streaming decoder, which wasn't catching 2-byte UTF-8 characters. It should be fixed with the latest commit:

image
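To illustrate the failure mode (this is a minimal sketch of the general problem, not exllamav2's actual decoder code): Hebrew letters encode to 2 bytes each in UTF-8, so when a stream is decoded in chunks that split a character mid-sequence, each incomplete fragment turns into a U+FFFD replacement character ("�"). An incremental decoder fixes this by buffering trailing incomplete bytes until the rest of the character arrives:

```python
import codecs

# "שלום" is 4 Hebrew letters, each a 2-byte UTF-8 sequence (8 bytes total).
text = "שלום"
data = text.encode("utf-8")

# Naive streaming: decode each 1-byte chunk independently.
# Every chunk splits a character, so every chunk becomes U+FFFD ("�").
naive = "".join(
    data[i:i + 1].decode("utf-8", errors="replace") for i in range(len(data))
)

# Correct streaming: an incremental decoder holds back incomplete
# multi-byte sequences and emits a character only once it is whole.
dec = codecs.getincrementaldecoder("utf-8")()
fixed = "".join(dec.decode(data[i:i + 1]) for i in range(len(data)))

print(naive)  # garbled replacement characters
print(fixed)  # שלום
```

The same principle applies to token streaming: a token boundary can fall inside a multi-byte character, so the streamer has to defer emitting text until the byte sequence is valid UTF-8.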

samuelazran commented 10 months ago

> Thank you for noticing. It was actually nothing to do with the tokenizer but rather the streaming decoder which wasn't catching 2-byte UTF8 characters. It should be fixed with the latest commit.

Thank you so much! Update: I have tested it and it works perfectly now.

BTW, thanks for this library; in my experience it's the best inference library!