labring / FastGPT

FastGPT is a knowledge-based platform built on LLMs that offers a comprehensive suite of out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration, letting you easily develop and deploy complex question-answering systems without extensive setup or configuration.
https://tryfastgpt.ai

glm4 or llama3.1 deployed locally with ollama cannot use tool calls #2662

Closed LuckLittleBoy closed 1 month ago

LuckLittleBoy commented 1 month ago

Routine checks

Your version

Problem description, log screenshots
1. With the glm4 model deployed locally via ollama, tool calls are not used correctly. Workflow screenshot: 联想截图_20240910142824 Backend log screenshot: 联想截图_20240910143502
2. With the online glm4 model, tool calls are used correctly. Workflow screenshot: 联想截图_20240910144100 Backend log screenshot: 联想截图_20240910144221

Steps to reproduce

Expected result: At first I thought ollama simply didn't support glm4's tools feature, so I switched to Llama 3.1, which is supported, but that didn't work either. What could be the cause? 联想截图_20240910152714

Related screenshots. Model config file screenshots:
1. Config for the ollama-deployed glm4 model: 联想截图_20240910143638
2. Config for the online glm4 model: 联想截图_20240910143741
OneApi channel screenshots: 联想截图_20240910144314 联想截图_20240910144405

RipperTs commented 1 month ago

I ran into a similar situation. I'm using the qwen2-7b model, which is on ollama's list of models that support tool calling.

Testing by sending a request directly from the command line:

curl http://localhost:11434/api/chat -d '{
  "model": "qwen2-7b-instruct",
  "messages": [
    {
      "role": "user",
      "content": "现在的时间是?"
    }
  ],
  "stream": false,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_time",
        "description": "Get the current time",
        "parameters": {
          "type": "object",
          "properties": {},
          "required": []
        }
      }
    }
  ]
}'

Non-stream response:

{
    "model": "qwen2-7b-instruct",
    "created_at": "2024-09-11T02:22:05.243532582Z",
    "message": {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {
                "function": {
                    "name": "get_current_time",
                    "arguments": {}
                }
            }
        ]
    },
    "done_reason": "stop",
    "done": true,
    "total_duration": 3568470745,
    "load_duration": 3297845610,
    "prompt_eval_count": 132,
    "prompt_eval_duration": 48330000,
    "eval_count": 22,
    "eval_duration": 212189000
}

Stream response:

{"model":"qwen2-7b-instruct","created_at":"2024-09-11T02:26:27.086948067Z","message":{"role":"assistant","content":"\u003ctool"},"done":false}

{"model":"qwen2-7b-instruct","created_at":"2024-09-11T02:26:27.097531673Z","message":{"role":"assistant","content":"_call"},"done":false}

{"model":"qwen2-7b-instruct","created_at":"2024-09-11T02:26:27.107974534Z","message":{"role":"assistant","content":"\u003e\n"},"done":false}

.....

{"model":"qwen2-7b-instruct","created_at":"2024-09-11T02:26:27.294572387Z","message":{"role":"assistant","content":""},"done_reason":"stop","done":true,"total_duration":280436360,"load_duration":18731823,"prompt_eval_count":132,"prompt_eval_duration":35464000,"eval_count":22,"eval_duration":218382000}

Result when called from FastGPT:

(screenshots)

Additional notes

If needed, I can provide a publicly accessible endpoint for testing and send the endpoint details to a designated email address.

c121914yu commented 1 month ago

Looks like when stream is true, the standard tool format isn't returned.

RipperTs commented 1 month ago

Sorry, that was my mistake: I wasn't sending the request through the OpenAI-compatible endpoint. Below are the results of calling ollama through the OpenAI-format endpoint:

Request:

curl http://localhost:11434/v1/chat/completions -d '{
  "model": "qwen2-7b-instruct",
  "messages": [
    {
      "role": "user",
      "content": "现在的时间是?"
    }
  ],
  "stream": true,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_time",
        "description": "Get the current time",
        "parameters": {
          "type": "object",
          "properties": {},
          "required": []
        }
      }
    }
  ]
}'

Non-streaming result:

{
    "id": "chatcmpl-128",
    "object": "chat.completion",
    "created": 1726104793,
    "model": "qwen2-7b-instruct",
    "system_fingerprint": "fp_ollama",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "",
                "tool_calls": [
                    {
                        "id": "call_2vtmo4np",
                        "type": "function",
                        "function": {
                            "name": "get_current_time",
                            "arguments": "{}"
                        }
                    }
                ]
            },
            "finish_reason": "tool_calls"
        }
    ],
    "usage": {
        "prompt_tokens": 132,
        "completion_tokens": 22,
        "total_tokens": 154
    }
}

Streaming result:

data: {"id":"chatcmpl-87","object":"chat.completion.chunk","created":1726104646,"model":"qwen2-7b-instruct","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"\u003ctool"},"finish_reason":null}]}

data: {"id":"chatcmpl-87","object":"chat.completion.chunk","created":1726104646,"model":"qwen2-7b-instruct","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"_call"},"finish_reason":null}]}

data: {"id":"chatcmpl-87","object":"chat.completion.chunk","created":1726104646,"model":"qwen2-7b-instruct","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"\u003e\n"},"finish_reason":null}]}

data: {"id":"chatcmpl-87","object":"chat.completion.chunk","created":1726104646,"model":"qwen2-7b-instruct","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"{\""},"finish_reason":null}]}

data: {"id":"chatcmpl-87","object":"chat.completion.chunk","created":1726104646,"model":"qwen2-7b-instruct","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"name"},"finish_reason":null}]}

data: {"id":"chatcmpl-87","object":"chat.completion.chunk","created":1726104646,"model":"qwen2-7b-instruct","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"\":"},"finish_reason":null}]}

data: {"id":"chatcmpl-87","object":"chat.completion.chunk","created":1726104646,"model":"qwen2-7b-instruct","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":" \""},"finish_reason":null}]}

data: {"id":"chatcmpl-87","object":"chat.completion.chunk","created":1726104646,"model":"qwen2-7b-instruct","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"get"},"finish_reason":null}]}

data: {"id":"chatcmpl-87","object":"chat.completion.chunk","created":1726104646,"model":"qwen2-7b-instruct","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"_current"},"finish_reason":null}]}

data: {"id":"chatcmpl-87","object":"chat.completion.chunk","created":1726104646,"model":"qwen2-7b-instruct","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"_time"},"finish_reason":null}]}

data: {"id":"chatcmpl-87","object":"chat.completion.chunk","created":1726104646,"model":"qwen2-7b-instruct","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"\","},"finish_reason":null}]}

data: {"id":"chatcmpl-87","object":"chat.completion.chunk","created":1726104646,"model":"qwen2-7b-instruct","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":" \""},"finish_reason":null}]}

data: {"id":"chatcmpl-87","object":"chat.completion.chunk","created":1726104646,"model":"qwen2-7b-instruct","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"arguments"},"finish_reason":null}]}

data: {"id":"chatcmpl-87","object":"chat.completion.chunk","created":1726104646,"model":"qwen2-7b-instruct","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"\":"},"finish_reason":null}]}

data: {"id":"chatcmpl-87","object":"chat.completion.chunk","created":1726104646,"model":"qwen2-7b-instruct","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":" {}"},"finish_reason":null}]}

data: {"id":"chatcmpl-87","object":"chat.completion.chunk","created":1726104646,"model":"qwen2-7b-instruct","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"}\n"},"finish_reason":null}]}

data: {"id":"chatcmpl-87","object":"chat.completion.chunk","created":1726104646,"model":"qwen2-7b-instruct","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"\u003c/"},"finish_reason":null}]}

data: {"id":"chatcmpl-87","object":"chat.completion.chunk","created":1726104646,"model":"qwen2-7b-instruct","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"tool"},"finish_reason":null}]}

data: {"id":"chatcmpl-87","object":"chat.completion.chunk","created":1726104646,"model":"qwen2-7b-instruct","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"_call"},"finish_reason":null}]}

data: {"id":"chatcmpl-87","object":"chat.completion.chunk","created":1726104646,"model":"qwen2-7b-instruct","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"\u003e"},"finish_reason":null}]}

data: {"id":"chatcmpl-87","object":"chat.completion.chunk","created":1726104646,"model":"qwen2-7b-instruct","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":"stop"}]}

data: [DONE]

I also forwarded the call through new-api, and the returned format is the same as above.
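Since the <tool_call> block arrives intact once the streamed deltas are concatenated, a client could in principle recover the call by parsing that text. A rough, hypothetical workaround sketch (not something FastGPT does), using the chunk contents shown above:

# Hypothetical client-side workaround: concatenate the streamed deltas and
# parse the <tool_call> block that the model emits as plain text back into
# a structured call.
import json
import re

def extract_tool_calls(streamed_text):
    """Return {"name", "arguments"} dicts found in <tool_call> blocks."""
    calls = []
    for block in re.findall(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", streamed_text, re.S):
        payload = json.loads(block)
        calls.append({"name": payload["name"], "arguments": payload.get("arguments", {})})
    return calls

# The deltas above concatenate to:
streamed = '<tool_call>\n{"name": "get_current_time", "arguments": {}}\n</tool_call>'
print(extract_tool_calls(streamed))
# -> [{'name': 'get_current_time', 'arguments': {}}]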

c121914yu commented 1 month ago

\u003c is an angle bracket (<), not a curly brace. Did you select the ollama channel type in oneapi?

RipperTs commented 1 month ago

Yes. I switched between the OpenAI and Ollama channel types, and the returned results are the same.

LuckLittleBoy commented 1 month ago

It is indeed caused by stream being set to true: ollama does not support streaming responses for tool calls, and FastGPT defaults stream to true for non-API calls. We'll have to wait for a later ollama version to add streaming support. 联想截图_20240912094356
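As a follow-up check (a minimal sketch, assuming the OpenAI-compatible endpoint at http://localhost:11434/v1 and the qwen2-7b-instruct model from the tests above), the script below streams a request with tools and reports whether any chunk carries a structured tool_calls delta; once an ollama release supports streaming tool calls it should print True:

# Re-run this after upgrading ollama to see whether streaming tool calls work.
import json
import requests

body = {
    "model": "qwen2-7b-instruct",
    "messages": [{"role": "user", "content": "现在的时间是?"}],
    "stream": True,
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_current_time",
            "description": "Get the current time",
            "parameters": {"type": "object", "properties": {}, "required": []},
        },
    }],
}

saw_tool_calls = False
with requests.post("http://localhost:11434/v1/chat/completions", json=body, stream=True) as resp:
    for line in resp.iter_lines():
        # SSE lines look like b"data: {...}"; skip keep-alives and [DONE].
        if not line or not line.startswith(b"data: ") or line == b"data: [DONE]":
            continue
        delta = json.loads(line[len(b"data: "):])["choices"][0]["delta"]
        if delta.get("tool_calls"):
            saw_tool_calls = True

print("streaming tool_calls supported:", saw_tool_calls)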