Closed: bestzy6 closed this issue 1 month ago
Hi,
I am unable to reproduce this with the given input. Could you share a full input for us to reproduce the issue?
The full input is "documents related to the BYD electric vehicle project". The issue only occurs intermittently.
Hi,
I have tried to reproduce this with transformers over 20 times with the given input and no luck for me. So I don't think it is a model issue.
What frameworks were you using?
@jklj077 I used vLLM for inference, on a T4 GPU.
Package versions (`pip list`):

```
accelerate 0.27.2 addict 2.4.0 aiofiles 23.2.1 aiohttp 3.9.3 aiosignal 1.3.1 aliyun-python-sdk-core 2.15.0 aliyun-python-sdk-kms 2.16.2 altair 5.2.0 annotated-types 0.6.0 anyio 4.3.0 asttokens 2.4.1 async-timeout 4.0.3 attrs 23.2.0 blinker 1.7.0 cachetools 5.3.3 certifi 2024.2.2 cffi 1.16.0
charset-normalizer 3.3.2 click 8.1.7 cloudpickle 3.0.0 cmake 3.29.5 colorama 0.4.6 comm 0.2.1 contourpy 1.2.0 cpm-kernels 1.0.11 crcmod 1.7 cryptography 42.0.5 cupy-cuda12x 12.1.0 cycler 0.12.1 datasets 2.18.0 debugpy 1.8.1 decorator 5.1.1 dill 0.3.8 diskcache 5.6.3 distro 1.9.0 docstring-parser 0.15
einops 0.7.0 exceptiongroup 1.2.0 executing 2.0.1 fastapi 0.110.0 fastrlock 0.8.2 ffmpy 0.3.2 filelock 3.13.1 flash-attn 1.0.9 fonttools 4.49.0 frozenlist 1.4.1 fsspec 2024.2.0 gast 0.5.4 gitdb 4.0.11 GitPython 3.1.42 gradio 3.50.2 gradio_client 0.6.1 h11 0.14.0 httpcore 1.0.4 httptools 0.6.1 httpx 0.27.0
huggingface-hub 0.23.3 idna 3.6 importlib_metadata 7.0.2 importlib_resources 6.1.3 interegular 0.3.3 ipykernel 6.29.3 ipython 8.22.2 jedi 0.19.1 jieba 0.42.1 Jinja2 3.1.3 jmespath 0.10.0 joblib 1.3.2 jsonschema 4.21.1 jsonschema-specifications 2023.12.1 jupyter_client 8.6.0 jupyter_core 5.7.1
kiwisolver 1.4.5 lark 1.1.9 latex2mathml 3.77.0 llvmlite 0.42.0 lm-format-enforcer 0.10.1 loguru 0.7.2 Markdown 3.5.2 markdown-it-py 3.0.0 MarkupSafe 2.1.5 matplotlib 3.8.3 matplotlib-inline 0.1.6 mdtex2html 1.3.0 mdurl 0.1.2 modelscope 1.13.1 mpmath 1.3.0 msgpack 1.0.8 multidict 6.0.5 multiprocess 0.70.16
nest-asyncio 1.6.0 networkx 3.2.1 ninja 1.11.1.1 nltk 3.8.1 numba 0.59.1 numpy 1.26.4 nvidia-cublas-cu12 12.1.3.1 nvidia-cuda-cupti-cu12 12.1.105 nvidia-cuda-nvrtc-cu12 12.1.105 nvidia-cuda-runtime-cu12 12.1.105 nvidia-cudnn-cu12 8.9.2.26 nvidia-cufft-cu12 11.0.2.54 nvidia-curand-cu12 10.3.2.106 nvidia-cusolver-cu12 11.4.5.107 nvidia-cusparse-cu12 12.1.0.106 nvidia-ml-py 12.555.43 nvidia-nccl-cu12 2.20.5 nvidia-nvjitlink-cu12 12.4.99 nvidia-nvtx-cu12 12.1.105
openai 1.33.0 orjson 3.9.15 oss2 2.18.4 outlines 0.0.34 packaging 23.2 pandas 2.2.1 parso 0.8.3 peft 0.9.0 pexpect 4.9.0 pillow 10.2.0 pip 23.3.1 platformdirs 4.2.0 prometheus_client 0.20.0 prometheus-fastapi-instrumentator 7.0.0 prompt-toolkit 3.0.43 protobuf 4.25.3 psutil 5.9.8 ptyprocess 0.7.0 pure-eval 0.2.2 py-cpuinfo 9.0.0
pyarrow 15.0.1 pyarrow-hotfix 0.6 pycparser 2.21 pycryptodome 3.20.0 pydantic 2.6.3 pydantic_core 2.16.3 pydeck 0.8.1b0 pydub 0.25.1 Pygments 2.17.2 pynvml 11.5.0 pyparsing 3.1.2 python-dateutil 2.9.0.post0 python-dotenv 1.0.1 python-multipart 0.0.9 pytz 2024.1 PyYAML 6.0.1 pyzmq 25.1.2 ray 2.9.3
referencing 0.33.0 regex 2023.12.25 requests 2.31.0 rich 13.7.1 rouge-chinese 1.0.3 rpds-py 0.18.0 ruamel.yaml 0.18.6 ruamel.yaml.clib 0.2.8 ruff 0.3.2 safetensors 0.4.2 scikit-learn 1.4.1.post1 scipy 1.12.0 semantic-version 2.10.0 sentence-transformers 2.5.1 sentencepiece 0.2.0 setuptools 68.2.2 shellingham 1.5.4 shtab 1.7.1 simplejson 3.19.2 six 1.16.0 smmap 5.0.1 sniffio 1.3.1 sortedcontainers 2.4.0
sse-starlette 2.0.0 stack-data 0.6.3 starlette 0.36.3 streamlit 1.32.0 sympy 1.12 tenacity 8.2.3 threadpoolctl 3.3.0 tiktoken 0.6.0 tokenizers 0.19.1 toml 0.10.2 tomli 2.0.1 tomlkit 0.12.0 toolz 0.12.1 torch 2.3.0 tornado 6.4 tqdm 4.66.2 traitlets 5.14.1 transformers 4.41.2 transformers-stream-generator 0.0.5 triton 2.3.0 trl 0.7.11 typer 0.9.0 typing_extensions 4.10.0 tyro 0.7.3 tzdata 2024.1
urllib3 2.2.1 uvicorn 0.28.0 uvloop 0.19.0 vllm 0.4.3 vllm-flash-attn 2.5.8.post2 watchdog 4.0.0 watchfiles 0.21.0 wcwidth 0.2.13 websockets 11.0.3 wheel 0.41.2 xformers 0.0.26.post1 xxhash 3.4.1 yapf 0.40.2 yarl 1.9.4 zipp 3.17.0
```
I have the same problem with [Qwen2-72B-Instruct]: it keeps repeating the output content.
I also frequently get repeated output when deploying qwen2-72b-instruct with vLLM. Is this a problem with the model itself, or with the vLLM deployment? @jklj077
Have you guys tried the demo on HF and ModelScope? You can compare that with your own deployment. And as hyperparameters matter, I don't think changing the temperature to 0.4 is a good choice. Please try with the original hyperparameters for generation.
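For reference, the `generation_config.json` shipped with the Qwen2 Instruct models at the time used roughly the following sampling defaults (treat these exact values as assumptions; check the `generation_config.json` bundled with your downloaded model):

```json
{
  "temperature": 0.7,
  "top_p": 0.8,
  "top_k": 20,
  "repetition_penalty": 1.05
}
```

Overriding these (e.g. lowering `temperature` to 0.4 without a repetition penalty) can make repetition loops more likely.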
Thank you very much for your reply. I'll give it a try
(Quoted email reply to: [QwenLM/Qwen2] Repeated answers from Qwen2-1.5B-Instruct (Issue #540), 2024-06-17.)
I have the same issue using Qwen2-1.5B. The output keeps producing next-word predictions; it seems the model is unable to emit an [END] token. You can reproduce it in my Colab.
@AllenLeong Your model_name is Qwen2-1.5B. It's a base model trained on the next token prediction task. Use qwen/Qwen2-1.5B-Instruct instead.
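For context on why the base model never stops: the Instruct models are trained on the ChatML format, where every turn ends with an `<|im_end|>` token that serves as the stop signal, while the base model has never learned to emit it. A hand-rolled sketch of the template for illustration only (in practice, use the tokenizer's `apply_chat_template`):

```python
def build_chatml_prompt(messages):
    """Render messages in ChatML, the chat format used by Qwen models.

    Each turn is wrapped in <|im_start|>...<|im_end|>; generation stops
    when the model emits <|im_end|>. Base models are not trained on this
    format and therefore have no reliable stop token.
    """
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    parts.append("<|im_start|>assistant\n")  # cue the model to answer
    return "\n".join(parts)

print(build_chatml_prompt([{"role": "user", "content": "Hello"}]))
```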
Thanks, that's very helpful.
This issue has been automatically marked as inactive due to lack of recent activity. Should you believe it remains unresolved and warrants attention, kindly leave a comment on this thread.
I ran into repeated answers when testing Qwen2-1.5B-Instruct. Is this a bug? The model was downloaded from ModelScope, and I did not change any configuration parameters.
The prompt is below; "XXX" is anonymized.
```json
{
  "model": "Qwen2-1.5B-Instruct",
  "messages": [
    {
      "role": "system",
      "content": "Please write a passage to answer the question. Try to include as many key details as possible. Write by Chinese!"
    },
    {
      "role": "user",
      "content": "context: XXX \n\n Passage:"
    }
  ],
  "stream": false,
  "max_tokens": 1024,
  "temperature": 0.4
}
```
The model keeps repeating its last sentence and cannot stop until it reaches max_tokens. The result (translated) is as follows:
XXX, a well-known Chinese automobile manufacturer, has launched a range of electric vehicles in China and worldwide. XXX's electric vehicles are known for their efficiency, environmental friendliness, and economy. XXX's electric vehicles use lithium batteries, which have long lifespans and low maintenance requirements, giving consumers a longer service life. XXX's electric vehicles are also equipped with advanced autonomous-driving technologies, such as automatic parking and adaptive cruise control, to improve driving safety. In addition, XXX's electric vehicles are equipped with advanced safety systems, such as emergency braking, automatic emergency steering, and automatic emergency braking, to ensure driving safety. XXX's electric vehicles are also equipped with advanced charging technologies, such as fast charging and wireless charging, for the driver's convenience. XXX's electric vehicles are also equipped with an advanced battery management system to ensure battery stability and safety. XXX's electric vehicles are also equipped with an advanced battery management system to ensure battery stability and safety. XXX's electric vehicles are also equipped with an advanced battery management system to ensure battery stability and safety. XXX's electric vehicles are also equipped with an advanced battery management system to ensure battery stability and safety. XXX's electric vehicles are also equipped with an advanced battery management system to ensure battery stability and safety. XXX's electric vehicles are also equipped with an advanced battery management system to ensure battery stability and safety. XXX's electric vehicles are also equipped with an advanced battery management system to ensure battery stability and safety. .........
See this document: https://qwen.readthedocs.io/zh-cn/latest/deployment/vllm.html# — setting the repetition penalty parameter, e.g. repetition_penalty=1.05, can resolve the repeated-generation problem.
If you are calling the OpenAI-compatible API, you can set frequency_penalty=1.05 instead.
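As an illustration of what the parameter does (not vLLM's actual implementation), the standard repetition penalty rescales the logits of tokens that have already been generated, making them less likely to be sampled again. A minimal sketch:

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.05):
    """Rescale logits of already-generated tokens (CTRL-style penalty).

    With penalty > 1, repeated tokens become less attractive: positive
    logits are divided by the penalty, negative ones multiplied by it.
    """
    out = list(logits)
    for tok in set(generated_ids):
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out

# Tokens 0 and 1 were already generated, so their logits are pushed down;
# token 2 is untouched.
print(apply_repetition_penalty([2.0, -1.0, 0.5], generated_ids=[0, 1], penalty=2.0))
# → [1.0, -2.0, 0.5]
```

With `penalty=1.0` the logits are unchanged, which is why a value slightly above 1 (such as 1.05) is a gentle nudge against loops rather than a hard ban on repetition.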