Open empty2enrich opened 8 months ago
I am using the Qwen 72B model, and the specified --conv-template does not take effect. If the stop parameter is not passed in the request, generation never stops.
Launch command
CUDA_VISIBLE_DEVICES=2,3 nohup python -m fastchat.serve.vllm_worker --conv-template qwen-7b-chat --model-path ./finetune_all_weight/checkpoint-6 --trust-remote-code --tensor-parallel-size 2 --dtype bfloat16 --model-names Qwen-72B-ft --gpu-memory-utilization 1 --port 31002 --worker-address http://localhost:31002 > nohup.out.Qwen-72B-ft &
Calling code
content = 'xxx'
client = OpenAI(
    api_key=openai.api_key,
    base_url=openai.base_url,
)
chat_response = client.chat.completions.create(
    model=model_name,
    messages=messages + [
        {"role": "user", "content": content},
    ],
    # stop=['<|im_end|>']
)
print("Chat response:", chat_response)
content = chat_response.choices[0].message.content
return content
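For anyone hitting the same thing, here is a minimal sketch of the client-side workaround implied by the commented-out line: pass the stop sequence explicitly on every request. The helper names `build_chat_request` and `chat` are hypothetical, and the sketch assumes that `<|im_end|>` is the end-of-turn marker for this model and that `client` is an `openai.OpenAI` instance pointed at the FastChat OpenAI-compatible server. It does not fix `--conv-template` itself.

```python
# Assumption: the Qwen chat model ends each turn with <|im_end|>,
# as in the commented-out stop argument in the issue.
QWEN_STOP = ["<|im_end|>"]


def build_chat_request(model_name, history, content, stop=None):
    """Build keyword arguments for client.chat.completions.create.

    Hypothetical helper: it always sets an explicit stop list so that
    generation ends even if the server-side template is not applied.
    """
    return {
        "model": model_name,
        "messages": list(history) + [{"role": "user", "content": content}],
        "stop": list(stop if stop is not None else QWEN_STOP),
    }


def chat(client, model_name, history, content):
    """Send one user turn and return the assistant's reply text.

    `client` is assumed to be an openai.OpenAI instance whose base_url
    points at the FastChat OpenAI-compatible endpoint.
    """
    request = build_chat_request(model_name, history, content)
    response = client.chat.completions.create(**request)
    return response.choices[0].message.content
```

Calling `chat(client, "Qwen-72B-ft", messages, "xxx")` would then send the same request as above, with `stop` filled in.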
I have the same problem here, any progress on it?