Demainlip opened this issue 1 month ago
Did you specify a conversation template name when you deployed the API? You need to specify the template name.
Here is my code:

python3 -m fastchat.serve.controller --host 0.0.0.0 --port 20001
python3 -m fastchat.serve.openai_api_server --host 0.0.0.0 --port 20000 --controller-address http://0.0.0.0:20001
python3 -m fastchat.serve.model_worker --host 0.0.0.0 --port 21001 --worker-address http://0.0.0.0:21001 --controller-address http://0.0.0.0:20001 --model-names "glm-4-9b-chat-1m" --model-path /home/LLM/glm-4-9b-chat-1m --device npu
python3 -m fastchat.serve.model_worker --host 0.0.0.0 --port 21001 --worker-address http://0.0.0.0:21001/ --controller-address http://0.0.0.0:20001/ --model-names "glm-4-9b-chat-1m" --model-path /home/LLM/glm-4-9b-chat-1m --device npu --conv-template chatglm3
Try adding --conv-template chatglm3 at the end, like this; as far as I remember, GLM-4's conversation template is the same as ChatGLM3's.
If that still doesn't work, you can edit the fastchat/conversation.py file and create a conversation template modeled on the code below.
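FastChat registers templates in fastchat/conversation.py via register_conv_template(Conversation(...)). As a rough, self-contained sketch of what a ChatGLM3-style template renders for a GLM-4 chat, the prompt layout looks roughly like the following. Note the special tokens (<|system|>, <|user|>, <|assistant|>) are assumptions based on the ChatGLM3/GLM-4 chat format, not confirmed by this thread; verify them against the model's own chat template before relying on this.

```python
# Sketch of the prompt layout a chatglm3-style conversation template
# produces. The role tags below are assumed from the ChatGLM3/GLM-4
# chat format; check the model's tokenizer chat template to confirm.
def build_glm_prompt(system: str, messages: list[tuple[str, str]]) -> str:
    """messages: list of (role, content) pairs, role in {"user", "assistant"}."""
    parts = [f"<|system|>\n{system}"]
    for role, content in messages:
        parts.append(f"<|{role}|>\n{content}")
    parts.append("<|assistant|>\n")  # generation continues after this tag
    return "".join(parts)

prompt = build_glm_prompt(
    "You are ChatGLM4, a large language model trained by Zhipu.AI.",
    [("user", "你好,给我讲一个故事,大概100字")],
)
print(prompt)
```

If the worker falls back to a generic template instead of one shaped like this, the model sees role tags it was never trained on, which matches the kind of degenerate repeated output shown later in this thread.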
fschat had already been modified before I ran it.
curl -X POST "http://0.0.0.0:20000/v1/chat/completions" -H "Content-Type: application/json" -d "{\"model\": \"glm-4-9b-chat-1m\", \"messages\": [{\"role\": \"system\", \"content\": \"You are ChatGLM4, a large language model trained by Zhipu.AI. Follow the user's instructions carefully. Respond using markdown.\"}, {\"role\": \"user\", \"content\": \"你好,给我讲一个故事,大概100字\"}], \"stream\": false, \"max_tokens\": 100, \"temperature\": 0.8, \"top_p\": 0.8}"

Response:

{"id":"chatcmpl-uasCb9Ay2D5KtzSKeJn9ZM","object":"chat.completion","created":1724161672,"model":"glm-4-9b-chat-1m","choices":[{"index":0,"message":{"role":"assistant","content":"puty, the systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems, the Systems"},"finish_reason":"stop"}],"usage":{"prompt_tokens":46,"total_tokens":146,"completion_tokens":100}}