huangshengfu closed this issue 4 months ago
To stop your Docker-deployed model from repeating <|im_start|> after it finishes a conversation, adjust the stop parameter in the LLMSingleActionAgent initialization. The stop list names the strings at which the model should cease generating output; it is the string-based counterpart to stop_token_ids, which takes token IDs instead. Since <|im_start|> is the token being repeated, you can add it to the stop list like so:

from langchain.agents import LLMSingleActionAgent

# llm_chain, output_parser, and tool_names are assumed to be defined earlier in your agent setup.
agent = LLMSingleActionAgent(
    llm_chain=llm_chain,
    output_parser=output_parser,
    stop=["\nObservation:", "Observation", "<|im_start|>"],  # "<|im_start|>" added to the stop list
    allowed_tools=tool_names,
)

This modification instructs the LLMSingleActionAgent to halt generation as soon as any string listed in stop appears, including the newly added <|im_start|> token.
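As a rough illustration of what a stop list does, the sketch below cuts generated text at the earliest stop sequence. The function name truncate_at_stop is hypothetical, and this is not the actual LangChain or vLLM implementation; it only mirrors the behaviour described above.

```python
from typing import Sequence


def truncate_at_stop(text: str, stop: Sequence[str]) -> str:
    """Return text cut off just before the earliest stop sequence, if any."""
    cut = len(text)
    for s in stop:
        if not s:
            continue  # an empty stop string would match at position 0
        idx = text.find(s)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


# Example: the runaway special tokens after the answer are dropped.
raw = "Final Answer: 42<|im_start|><|im_start|>"
print(truncate_at_stop(raw, ["\nObservation:", "<|im_start|>"]))
```

If none of the stop strings occur, the text is returned unchanged, which is why the stop string has to match the emitted token text exactly.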
This solution assumes that you have access to and can modify the initialization of LLMSingleActionAgent within your deployment setup. If you want to pass stop_token_ids directly via Docker command parameters, note that this might not be supported by docker run itself unless the underlying application code is modified to accept and process such parameters.
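An alternative that needs no change to the container is to send the stop settings with each request. The sketch below only builds the JSON body for a chat completion call. It assumes that the vLLM OpenAI-compatible server accepts stop_token_ids as an extra request field next to the standard stop field, and that 151644 and 151645 are the IDs of <|im_start|> and <|im_end|> in the Qwen tokenizer; both assumptions should be checked against your vLLM version and tokenizer. The helper name build_chat_payload is hypothetical.

```python
import json
from typing import Any, Dict, List, Optional


def build_chat_payload(
    model: str,
    user_message: str,
    stop: Optional[List[str]] = None,
    stop_token_ids: Optional[List[int]] = None,
    max_tokens: int = 512,
) -> Dict[str, Any]:
    """Build a chat completion request body with optional stop settings."""
    payload: Dict[str, Any] = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": max_tokens,
    }
    if stop:
        payload["stop"] = stop  # standard OpenAI-style stop strings
    if stop_token_ids:
        # Assumption: treated by vLLM as an extra sampling parameter.
        payload["stop_token_ids"] = stop_token_ids
    return payload


payload = build_chat_payload(
    model="/data/shared/Qwen/Qwen-Chat",  # the --model path from the Docker command
    user_message="Hello",
    stop=["<|im_end|>", "<|im_start|>"],
    stop_token_ids=[151644, 151645],  # assumed IDs; verify with your tokenizer
)
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

With the port mapping from the reported Docker command (-p 8901:8000), this body would be POSTed to http://localhost:8901/v1/chat/completions with an Authorization: Bearer header carrying the --api-key value. When using the openai Python client, a non-standard field such as stop_token_ids would go into extra_body rather than a named argument.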
If you're encountering specific errors when running your Docker command or if there are other methods you've tried without success, providing more details could help in offering a more tailored solution.
Which file should this configuration be added to?
With vllm, you should set the special tokens yourself.
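Setting the special tokens yourself means knowing their numeric IDs first. A minimal sketch for looking them up is shown below. It assumes the tokenizer exposes convert_tokens_to_ids, as Hugging Face tokenizers do, and that it returns None for a token it does not know; some tokenizers return an unknown-token ID instead. The helper name special_token_ids is hypothetical.

```python
from typing import Dict, Iterable, Optional, Protocol


class TokenizerLike(Protocol):
    """Minimal interface this sketch needs from a tokenizer."""

    def convert_tokens_to_ids(self, tokens: str) -> Optional[int]: ...


def special_token_ids(tokenizer: TokenizerLike, tokens: Iterable[str]) -> Dict[str, int]:
    """Map each special token string to its ID, failing loudly if one is missing."""
    ids: Dict[str, int] = {}
    for tok in tokens:
        tok_id = tokenizer.convert_tokens_to_ids(tok)
        if tok_id is None:
            raise KeyError(f"tokenizer does not know token {tok!r}")
        ids[tok] = tok_id
    return ids


# Usage sketch (not executed here; needs transformers and the local model files):
# from transformers import AutoTokenizer
# tok = AutoTokenizer.from_pretrained(
#     "/data1/Download/models/Qwen-14B-Chat-Int4", trust_remote_code=True
# )
# print(special_token_ids(tok, ["<|im_start|>", "<|im_end|>"]))
```

The resulting IDs are what a stop_token_ids list would contain.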
After the chat finishes answering a question, it cannot end the conversation and keeps printing <|im_start|> over and over.
Docker run command:
docker run --runtime nvidia --gpus all --name vllm-qwen14b \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    -v /data1/Download/models/Qwen-14B-Chat-Int4:/data/shared/Qwen/Qwen-Chat \
    -p 8901:8000 --ipc=host \
    vllm/vllm-openai:latest \
    --model /data/shared/Qwen/Qwen-Chat --max-model-len 2048 --trust-remote-code \
    --tensor-parallel-size 2 --gpu-memory-utilization 0.7 --api-key "xxxxx"
Which parameter should I set to make this work? I assume stop_token_ids, but I don't know how to pass it.
Expected Result: The response should end normally once the answer is complete.
Actual Result: As shown in the screenshot above.