QwenLM / Qwen

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

[BUG] Passing a list [198, 151643] as generation_config.eos_token_id to chat_stream raises an error. Token 198 is the newline character; the intent is to make the model stop generating at a newline. #1001

Closed: chenyzh28 closed this issue 8 months ago

chenyzh28 commented 8 months ago

Is there an existing issue / discussion for this?

Is there an existing answer for this in FAQ?

Current Behavior

Generation raises an error.

Expected Behavior

The model generates normally.

Steps To Reproduce

generation_config.eos_token_id = [198, 151643]
model.chat_stream(tokenizer, prompt, history=None, generation_config=generation_config)

Environment

OS: Ubuntu 20.04
Python: 3.8
Transformers: 4.31.0
PyTorch: 2.0.1
CUDA: 11.4

Anything else?


The error only occurs when eos_token_id is set to [198, 151643]; with the original eos_token_id=151643 there is no error, but generation then does not stop at the newline as intended.

jklj077 commented 8 months ago

Pass stop_words_ids in generation_config instead.
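A minimal sketch of the suggested workaround, assuming a standard Qwen chat checkpoint loaded with trust_remote_code as in the project README. The checkpoint name and the prompt are illustrative assumptions, and the newline token id (198 per this issue) should be verified with tokenizer.encode("\n"). Unlike eos_token_id, stop_words_ids is a list of token-id sequences, so single- and multi-token stop strings can both be expressed.

from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.generation import GenerationConfig

model_name = "Qwen/Qwen-7B-Chat"  # illustrative checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name, device_map="auto", trust_remote_code=True
).eval()

generation_config = GenerationConfig.from_pretrained(model_name, trust_remote_code=True)
# stop_words_ids is a list of token-id sequences; the single newline token
# (id 198 for this tokenizer, per the issue) is written as [[198]].
generation_config.stop_words_ids = [[198]]

prompt = "List three suggestions, one per line."  # illustrative prompt
response = ""
for response in model.chat_stream(
    tokenizer, prompt, history=None, generation_config=generation_config
):
    pass  # each yield is the decoded response accumulated so far
print(response)  # generation should stop at the first newline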

chenyzh28 commented 8 months ago

That works, thanks.