mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0
18.63k stars · 1.51k forks

[Bug] mlc_llm randomly generates corrupted Unicode characters when outputting Chinese #2835

Open LuRenJiasWorld opened 3 weeks ago

LuRenJiasWorld commented 3 weeks ago

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

  1. Install the latest mlc-llm and mlc-ai in a conda environment with Python 3.12, running on an Apple Silicon (M1 Pro) MacBook Pro with 32 GiB of RAM.
  2. Download the Qwen2-7B-Instruct MLC model from https://huggingface.co/mlc-ai/Qwen2-7B-Instruct-q4f16_1-MLC (other LLMs can also reproduce this issue).
  3. Run `mlc_llm serve Qwen2-7B-Instruct-q4f16_1-MLC --host 0.0.0.0` to start the server (`mlc_llm chat` can also reproduce this issue).
  4. In any application that produces a lot of output (for example, Immersive Translate working with the OpenAI-compatible API), the result contains many corrupted Chinese characters, as shown in the attached screenshots.

Using the same model, the same application, and the same prompt on a Linux server with an Nvidia L20 GPU, I could also reproduce this issue, though less frequently than on the MacBook.

image

Expected behavior

There should be no corrupted Unicode characters when outputting Chinese. The corruption is frustrating: I frequently have to guess what the word should be.

LuRenJiasWorld commented 3 weeks ago
image

It appeared again when I tried to translate this issue XD

LuRenJiasWorld commented 3 weeks ago

It seems the corruption often occurs at the first Chinese character after a non-Chinese character. Could the tokenizer be the cause?
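That pattern would be consistent with a detokenizer decoding each token's bytes independently: Chinese characters are three bytes in UTF-8, and BPE token boundaries need not align with character boundaries, so decoding chunk-by-chunk yields U+FFFD replacement characters exactly at a script transition. A minimal pure-Python illustration of the failure mode (not MLC code, just the general mechanism):

```python
# A Chinese string following ASCII text, encoded as UTF-8 bytes.
text = "Hello, 世界"
data = text.encode("utf-8")

# Simulate two token byte chunks whose boundary falls inside the
# 3-byte UTF-8 sequence for '世' (bytes \xe4\xb8\x96).
chunks = [data[:8], data[8:]]  # split mid-character

# Decoding each chunk independently corrupts the character,
# while decoding the joined bytes does not.
naive = "".join(c.decode("utf-8", errors="replace") for c in chunks)
print(naive)  # 'Hello, ���界' -- replacement characters appear
print(b"".join(chunks).decode("utf-8"))  # 'Hello, 世界' -- intact
```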

MasterJH5574 commented 4 days ago

Hi @LuRenJiasWorld, sorry for the delayed response. Would you mind providing a Python script that runs with MLCEngine and that we can use to reproduce the issue? That would be very helpful for identifying the problem.

samuelqy commented 1 day ago

Facing the same issue.

samuelqy commented 1 day ago

My best guess is that `tokenizer.decode` inside TVM has some issues.
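If per-chunk decoding is indeed the cause, the standard fix is to decode incrementally and hold back any trailing bytes that end mid-character until the rest of the sequence arrives. A sketch of that technique using Python's incremental UTF-8 decoder (the general approach, not MLC's or TVM's actual implementation):

```python
import codecs

def stream_decode(chunks):
    """Decode UTF-8 byte chunks incrementally, buffering any
    trailing partial multi-byte sequence until it completes."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    for chunk in chunks:
        piece = decoder.decode(chunk)  # buffers incomplete tails
        if piece:
            yield piece
    tail = decoder.decode(b"", final=True)  # flush remaining bytes
    if tail:
        yield tail

data = "Hello, 世界".encode("utf-8")
chunks = [data[:8], data[8:]]  # boundary falls inside '世'
print("".join(stream_decode(chunks)))  # 'Hello, 世界' -- no corruption
```

The first chunk yields only `'Hello, '`; the lone leading byte of `'世'` stays buffered until the second chunk completes it.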