--conv-template parameter has no effect

I am using the Qwen 72B model, and the --conv-template I specify does not take effect. Unless I pass the stop parameter in the request, generation never stops.
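For context, Qwen chat models use a ChatML-style prompt in which every turn is closed by `<|im_end|>`. If the server does not treat that marker as a stop sequence, nothing tells it the assistant turn is over, which would match the symptom described here. Below is a rough sketch of that layout. `build_chatml_prompt` is a hypothetical helper written for illustration, not a FastChat or Qwen function, and the default system message is an assumption.

```python
# Illustrative sketch only: build_chatml_prompt is a hypothetical helper,
# not part of FastChat or Qwen.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chatml_prompt(messages, system="You are a helpful assistant."):
    """Render OpenAI-style messages as a ChatML-style prompt string."""
    # The system message text is an assumed default.
    parts = [f"{IM_START}system\n{system}{IM_END}\n"]
    for msg in messages:
        # Each turn is opened by <|im_start|>role and closed by <|im_end|>.
        parts.append(f"{IM_START}{msg['role']}\n{msg['content']}{IM_END}\n")
    # Leave the assistant turn open; the model is expected to close it
    # by emitting <|im_end|> itself.
    parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


prompt = build_chatml_prompt([{"role": "user", "content": "xxx"}])
print(prompt)
```

Because the assistant turn is left open, the only end-of-turn signal is the `<|im_end|>` the model generates, so it has to be recognized either by the conversation template or by an explicit stop value.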

Launch command

CUDA_VISIBLE_DEVICES=2,3 nohup python -m fastchat.serve.vllm_worker --conv-template qwen-7b-chat --model-path ./finetune_all_weight/checkpoint-6 --trust-remote-code --tensor-parallel-size 2 --dtype bfloat16 --model-names Qwen-72B-ft --gpu-memory-utilization 1 --port 31002 --worker-address http://localhost:31002 > nohup.out.Qwen-72B-ft &

Calling code

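import openai
from openai import OpenAI

# The original snippet ends in `return`, so it is wrapped in a function here;
# the name `ask` and its parameters are placeholders.
def ask(messages, model_name):
    content = 'xxx'
    # openai.api_key and openai.base_url are set elsewhere to point at the
    # FastChat OpenAI-compatible API server.
    client = OpenAI(
        api_key=openai.api_key,
        base_url=openai.base_url,
    )

    chat_response = client.chat.completions.create(
        model=model_name,
        messages=messages + [
            {"role": "user", "content": content},
        ],
        # stop=['<|im_end|>']  # with this left commented out, generation never ends
    )
    print("Chat response:", chat_response)
    content = chat_response.choices[0].message.content
    return content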
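Until the template issue is resolved, the direct workaround is to pass `stop=['<|im_end|>']` in the request, as in the commented-out line of the calling code. As a client-side fallback, the returned text can also be cut at the first stop marker. The sketch below assumes the markers `<|im_end|>` and `<|endoftext|>`; `truncate_at_stop` is a hypothetical helper, not part of FastChat or the OpenAI client.

```python
# Hypothetical client-side fallback: trim a completion at the first stop marker.
# This does not stop generation on the server; it only cleans the returned text.
def truncate_at_stop(text, stop_strings=("<|im_end|>", "<|endoftext|>")):
    """Return text up to (not including) the earliest stop string found."""
    cut = len(text)
    for marker in stop_strings:
        idx = text.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


raw = "Hello!<|im_end|>\n<|im_start|>user\nmore text"
print(truncate_at_stop(raw))  # prints: Hello!
```

Trimming after the fact still pays for all the extra tokens, so passing `stop` (and a `max_tokens` limit) in the request remains the better option when the server honors it.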

lm-sys/FastChat, issue #2897