abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io
MIT License

Text generation issue with missing characters #286

Closed 1980Dragon closed 10 months ago

1980Dragon commented 1 year ago

Hello, I wanted to report an issue with non-English output after a recent patch. I'm a Korean user and frequently use Korean language models. Until recently I had no problems, but with the latest version of llama-cpp-python (I'm using 0.1.55), I noticed that characters are occasionally missing from the generated text.

It seems to be related to how tokens are decoded in the _create_completion function in llama.py, although I'm not entirely certain, as I'm a beginner in coding. When I replace that function with an older version, the missing-character issue disappears, which leads me to suspect it is the cause. I haven't tested other languages with different writing systems, such as Chinese or Japanese, so I can't say whether it's specific to Korean, but when generating Korean text the problem occurs consistently.

I would greatly appreciate it if you could look into resolving this issue. Many of my activities heavily rely on llama-cpp-python, so it would be quite inconvenient if I can't use the latest version. Thank you for your attention.

Example: when inferring the Korean translation of 'Beautiful Woman', it should result in '아름다운 여자', but instead it produces '아다 여자'. The intermediate characters '름' and '운' are missing.
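To make the failure mode concrete, here is a minimal sketch (an illustration only, not llama-cpp-python's actual code) of what goes wrong when each token's bytes are decoded independently: any character whose UTF-8 encoding is split across a token boundary is silently dropped, whereas buffering bytes across tokens preserves it.

```python
import codecs

# Minimal illustration: '아름다운 여자' encoded as UTF-8 and split into
# 4-byte chunks, standing in for token boundaries that fall inside
# multi-byte characters.
data = "아름다운 여자".encode("utf-8")
tokens = [data[i:i + 4] for i in range(0, len(data), 4)]

# Buggy: decoding each chunk on its own drops partial characters.
buggy = "".join(t.decode("utf-8", errors="ignore") for t in tokens)

# Correct: an incremental decoder buffers bytes across chunk boundaries.
dec = codecs.getincrementaldecoder("utf-8")()
fixed = "".join(dec.decode(t) for t in tokens) + dec.decode(b"", final=True)

print(buggy)  # '아운 여자' -- '름' and '다' are dropped with this split
print(fixed)  # '아름다운 여자'
```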

nai-kon commented 1 year ago

I also confirmed the same issue in Japanese. After doing some experiments, I figured out the following.

I found there was a large update to the decoding of multi-byte characters in streaming mode in llama.py between 0.1.50 and 0.1.51, and I suspect this is what causes the issue.

abetlen commented 1 year ago

@1980Dragon @nai-kon I'm really sorry about that. I had to refactor some of the completion logic, and the previous fix for multi-byte unicode sequences was incompatible with the changes. I'll try to migrate it over to the new logic; could you please provide me with some examples of expected inputs and outputs? Much appreciated.

1980Dragon commented 1 year ago

Thank you for your response. As a beginner in coding, I'm not sure of the most effective way to present this information, but I have put together a few examples below. I'm not sure whether they are sufficient; if more examples are needed or a different format is required, please let me know. Thank you once again.

Model: WizardLM-7B-uncensored.ggmlv3.q5_1, using the Text generation web UI.

[Format] Question: Translate '{English Word}' to Korean. Translation: {expected output} ({current output})

Question: Translate 'Hello' to Korean. Translation: 안녕하세요 (안하세요)

Question: Translate 'Computer' to Korean. Translation: 컴퓨터 (터)

Question: Translate 'America' to Korean. Translation: 아메리카 (아리)

Question: Translate 'Love' to Korean. Translation: 사랑 (사)

Question: Translate 'Morning' to Korean. Translation: 아침 (아)
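A minimal reproduction harness for prompts like these could look as follows (a sketch only; the model path is a placeholder, and outputs will vary between runs). It compares streamed output, where the bug appears, against non-streamed output:

```python
from llama_cpp import Llama

# Sketch of a reproduction harness; the model path is a placeholder.
llm = Llama(model_path="./WizardLM-7B-uncensored.ggmlv3.q5_1.bin")

prompt = "Question: Translate 'Hello' to Korean. Translation:"

# Streamed generation, where the missing-character bug shows up.
streamed = "".join(
    chunk["choices"][0]["text"]
    for chunk in llm(prompt, max_tokens=32, stream=True)
)

# Non-streamed generation for comparison.
whole = llm(prompt, max_tokens=32)["choices"][0]["text"]

print(repr(streamed))  # e.g. '안하세요' instead of '안녕하세요' in affected versions
print(repr(whole))
```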

nai-kon commented 1 year ago

Thank you for your reply. Because LLM output isn't fully reproducible, I can't come up with the best way to provide examples, but 1980Dragon's translation examples seem like a good idea. Here is an example in Japanese.

Translate "newspaper" to Japanese.

Again, thank you for this great work!

MeouSker77 commented 1 year ago

Hello, I also confirmed the same issue in Chinese.

And I figured out that it has happened since this commit: dc39cc0fa410f8b46954ad507b705052947da6bc

Hope it helps, thanks

MeouSker77 commented 1 year ago

@abetlen Hi, I have a simple fix for this bug: #309. Although it's just a draft, it should serve as a reference.

Thanks
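For context, the general shape of such a buffering fix (a sketch of the idea only; see #309 for the actual change) is to accumulate raw token bytes during streaming and emit text only once the buffer decodes as valid UTF-8:

```python
def stream_text(llama, token_ids):
    """Sketch of the buffering idea: hold raw bytes until they
    form complete UTF-8 characters (see #309 for the real diff)."""
    buf = b""
    for tok in token_ids:
        buf += llama.detokenize([tok])  # detokenize returns raw bytes
        try:
            text = buf.decode("utf-8")
        except UnicodeDecodeError:
            continue  # incomplete multi-byte sequence; keep buffering
        yield text
        buf = b""
    # A production version would also bound how long it buffers,
    # so genuinely invalid bytes don't stall the stream indefinitely.
```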

gjmulder commented 1 year ago

@abetlen ping

williamchai commented 1 year ago

Just tested: @MeouSker77's draft fix works for Korean, Japanese, and Chinese.

1980Dragon commented 1 year ago

Great news.

williamchai commented 1 year ago

@1980Dragon btw the draft fix is not merged yet

nai-kon commented 1 year ago

@1980Dragon Please reopen this issue. Don't close it until the PR is merged.

1980Dragon commented 1 year ago

Oh, sorry. I'm not familiar with the GitHub workflow. I've reopened the issue.

Phate334 commented 1 year ago

> @abetlen Hi, I have a simple fix for this bug: #309. Although it's just a draft, it should serve as a reference.
>
> Thanks

I tried this modification in oobabooga/text-generation-webui and it worked fine.

LiuYuWei commented 1 year ago

I would like to confirm that the PR submitted by MeouSker77 is beneficial in my case. It is extremely important for Traditional Chinese output, as it resolves the existing missing-character issue.

While using the current version, we found that occasionally, there would be missing characters when outputting Traditional Chinese text. This not only affects the user experience but can also lead to the loss or misinterpretation of important information.

In MeouSker77's PR, they made some adjustments and optimizations to the code to ensure that no characters are missed when outputting Traditional Chinese text. We have conducted multiple tests on this PR and confirmed its effectiveness and stability.

Therefore, we strongly recommend that you consider this PR, as it not only solves a significant issue but also enhances the user experience.

Before: [screenshot showing missing characters]

After: [screenshot showing the complete output]

williamchai commented 1 year ago

#309 has been merged!

hzgdeerHo commented 7 months ago

This issue still confuses me when I use llama-cpp-python to load a model. Can anyone fix the problem?

```python
from llama_cpp import Llama

# args, tokenizer, and user_prompt are defined elsewhere in my script.
args.model_name_or_path = 'TheBloke/WizardCoder-33B-V1.1-GGUF'
llm = Llama.from_pretrained(
    repo_id=args.model_name_or_path,
    filename="wizardcoder-33b-v1.1.Q6_K.gguf",
    chat_format="llama-2",
    n_ctx=12000,
    tokenizer=tokenizer,
    n_gpu_layers=-1,
    verbose=True,
)
output = llm.create_chat_completion(
    messages=[
        {
            "role": "system",
            "content": "You are an AI that follows instructions extremely well. "
                       "Help as much as you can. Remember, respond in Chinese.",
        },
        {"role": "user", "content": user_prompt},
    ],
    stream=True,
    max_tokens=12000,
    repeat_penalty=1,
)
```

Sample streamed output (the replacement character marks where a multi-byte character was lost):

注意，SQL语句中的字段名、表名都需要用反引号`包�起来，并且日期格式需要与数据库中的日期格式一致。

(Translation: "Note that column and table names in the SQL statement must be wrapped in backticks (`), and the date format must match the date format in the database.") Generation time: 30.285314559936523 s, text length: 353, time per character (sec/char): 0.08579409223778052.
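For what it's worth, with stream=True the call above returns an iterator of OpenAI-style chunks rather than a finished string, so the text has to be assembled chunk by chunk, roughly as in this sketch (reusing output from the snippet above; the "delta" dict may omit "content" in some chunks):

```python
# Assemble the streamed response from the snippet above.
for chunk in output:
    delta = chunk["choices"][0]["delta"]
    print(delta.get("content", ""), end="", flush=True)
print()
```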