labring / FastGPT

FastGPT is a knowledge-based platform built on LLMs that offers a comprehensive suite of out-of-the-box capabilities, such as data processing, RAG retrieval, and visual AI workflow orchestration, letting you easily develop and deploy complex question-answering systems without extensive setup or configuration.
https://fastgpt.in

xinference / glm4 tool call fails with 400 #1823

Closed JinCheng666 closed 5 days ago

JinCheng666 commented 2 months ago

Routine checks

Your version

Problem description, log screenshots

glm-4-9b is deployed with xinference and connected to FastGPT through oneapi. The glm4 chat feature works normally, but tool calls with glm4 fail with a 400 error.

Version info:

xinference: 0.12.2, fastgpt: 4.8.4-fix, oneapi: 0.6.6, glm4: glm-4-9b-chat

The glm4 chat feature works normally:

[screenshot]

Tool calls with glm4 fail with a 400 error:

[screenshot]

config.json

{
  "model": "glm-4-9b",
  "name": "glm-4-9b",
  "maxContext": 8192,
  "avatar": "/imgs/model/chatglm.svg",
  "maxResponse": 3000,
  "quoteMaxToken": 6000,
  "maxTemperature": 1.2,
  "charsPointsPrice": 0,
  "censor": false,
  "vision": false,
  "datasetProcess": false,
  "usedInClassify": true,
  "usedInExtractFields": true,
  "usedInToolCall": true,
  "usedInQueryExtension": true,
  "toolChoice": true,
  "functionCall": true,
  "customCQPrompt": "",
  "customExtractPrompt": "",
  "defaultSystemChatPrompt": "",
  "defaultConfig": {}
},
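To reproduce the failure outside FastGPT, it helps to look at the request body FastGPT sends for a tool call. The sketch below builds such a payload; the endpoint, API key, and `get_weather` tool are illustrative placeholders, not taken from this thread. As the later comments pinpoint, the variable that matters is the `stream` flag when `tools` is present.

```python
import json

def build_tool_call_payload(model: str, question: str, stream: bool) -> dict:
    """Build an OpenAI-compatible /v1/chat/completions payload with tools.

    The tool definition is a hypothetical example; the `stream` flag is the
    parameter the thread identifies as the trigger of the 400.
    """
    return {
        "model": model,
        "stream": stream,
        "messages": [{"role": "user", "content": question}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool for illustration
                "description": "Look up the weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }

# Payload as sent from a normal chat (stream enabled), which reportedly fails:
payload = build_tool_call_payload("glm-4-9b", "What's the weather in Beijing?", stream=True)
print(json.dumps(payload, ensure_ascii=False)[:80])
```

POSTing this body (e.g. with `requests`) to the oneapi `/v1/chat/completions` endpoint should reproduce the 400 on affected xinference versions, while the same body with `"stream": false` succeeds.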

oneapi error log

[SYS] 2024/06/22 - 17:21:29 | model ratio not found: glm-4-9b
[INFO] 2024/06/22 - 17:21:29 | 2024062217212958147780023082485 | user 1 has enough quota 999222410797, trusted and no need to pre-consume
[ERR] 2024/06/22 - 17:21:29 | 2024062217212958147780023082485 | relay error happen, status code is 400, won't retry in this case
[ERR] 2024/06/22 - 17:21:29 | 2024062217212958147780023082485 | relay error (channel #13): bad response status code 400
[GIN] 2024/06/22 - 17:21:29 | 2024062217212958147780023082485 | 400 |     10.0611ms |     10.4.134.11 |    POST /v1/chat/completions

fastgpt error log

{
  message: '400 bad response status code 400 (request id: 2024062217212958147780023082485)',
  stack: 'Error: 400 bad response status code 400 (request id: 2024062217212958147780023082485)\n' +
    '    at eL.generate (/app/projects/app/.next/server/chunks/76750.js:15:67594)\n' +
    '    at av.makeStatusError (/app/projects/app/.next/server/chunks/76750.js:15:79337)\n' +
    '    at av.makeRequest (/app/projects/app/.next/server/chunks/76750.js:15:80260)\n' +
    '    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n' +
    '    at async w (/app/projects/app/.next/server/chunks/75612.js:309:2105)\n' +
    '    at async Object.w [as tools] (/app/projects/app/.next/server/chunks/75612.js:305:4790)\n' +
    '    at async k (/app/projects/app/.next/server/chunks/75612.js:313:2241)\n' +
    '    at async Promise.all (index 0)\n' +
    '    at async E (/app/projects/app/.next/server/chunks/75612.js:313:2782)\n' +
    '    at async h (/app/projects/app/.next/server/pages/api/core/chat/chatTest.js:1:3266)'
}

xinference error log

2024-06-22 17:25:16,414 xinference.core.supervisor 43237 DEBUG    Enter get_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x7f151234e2a0>, 'glm-4-9b'), kwargs: {}
2024-06-22 17:25:16,415 xinference.core.worker 43237 DEBUG    Enter get_model, args: (<xinference.core.worker.WorkerActor object at 0x7f15123c3e20>,), kwargs: {'model_uid': 'glm-4-9b-1-0'}
2024-06-22 17:25:16,415 xinference.core.worker 43237 DEBUG    Leave get_model, elapsed time: 0 s
2024-06-22 17:25:16,415 xinference.core.supervisor 43237 DEBUG    Leave get_model, elapsed time: 0 s
2024-06-22 17:25:16,416 xinference.core.supervisor 43237 DEBUG    Enter describe_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x7f151234e2a0>, 'glm-4-9b'), kwargs: {}
2024-06-22 17:25:16,416 xinference.core.worker 43237 DEBUG    Enter describe_model, args: (<xinference.core.worker.WorkerActor object at 0x7f15123c3e20>,), kwargs: {'model_uid': 'glm-4-9b-1-0'}
2024-06-22 17:25:16,416 xinference.core.worker 43237 DEBUG    Leave describe_model, elapsed time: 0 s
2024-06-22 17:25:16,416 xinference.core.supervisor 43237 DEBUG    Leave describe_model, elapsed time: 0 s
slot181 commented 1 month ago

I found that the same problem also occurs with claude.

fastgpt | message: '400 messages: roles must alternate between "user" and "assistant", but found multiple "user" roles in a row (request id: 2024062912300280418215164311064)',
fastgpt | stack: 'Error: 400 messages: roles must alternate between "user" and "assistant", but found multiple "user" roles in a row (request id: 2024062912300280418215164311064)\n' +
fastgpt |   '    at eL.generate (/app/projects/app/.next/server/chunks/76750.js:15:67594)\n' +
fastgpt |   '    at av.makeStatusError (/app/projects/app/.next/server/chunks/76750.js:15:79337)\n' +
fastgpt |   '    at av.makeRequest (/app/projects/app/.next/server/chunks/76750.js:15:80260)\n' +
fastgpt |   '    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n' +
fastgpt |   '    at async Object.P [as chatNode] (/app/projects/app/.next/server/chunks/75612.js:312:686)\n' +
fastgpt |   '    at async k (/app/projects/app/.next/server/chunks/75612.js:313:2241)\n' +
fastgpt |   '    at async Promise.all (index 5)\n' +
fastgpt |   '    at async E (/app/projects/app/.next/server/chunks/75612.js:313:2782)\n' +
fastgpt |   '    at async C (/app/projects/app/.next/server/pages/api/v1/chat/completions.js:63:11920)\n' +
fastgpt |   '    at async /app/projects/app/.next/server/pages/api/core/app/list.js:1:5593'
fastgpt | }
fastgpt | [Info] 2024-06-29 10:30:03 Request finish /api/v1/chat/completions, time: 1195ms
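The claude error above is a different 400: Anthropic-style APIs require messages to strictly alternate between "user" and "assistant" roles. A common client-side workaround (a sketch, not FastGPT's actual code) is to merge adjacent messages that share a role before sending:

```python
def merge_consecutive_roles(messages: list[dict]) -> list[dict]:
    """Merge adjacent messages with the same role, so the history satisfies
    the strict user/assistant alternation that Anthropic-style APIs enforce."""
    merged: list[dict] = []
    for msg in messages:
        if merged and merged[-1]["role"] == msg["role"]:
            # Same role as the previous message: concatenate the contents.
            merged[-1]["content"] += "\n" + msg["content"]
        else:
            merged.append({"role": msg["role"], "content": msg["content"]})
    return merged

# Two user messages in a row would trigger the 400 above:
history = [
    {"role": "user", "content": "hello"},
    {"role": "user", "content": "are you there?"},
    {"role": "assistant", "content": "yes"},
]
fixed = merge_consecutive_roles(history)
```

After merging, `fixed` contains one user message followed by one assistant message, which the API accepts.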

c121914yu commented 1 month ago

> I found that the same problem also occurs with claude.

Function calling for claude is not supported yet.

JinCheng666 commented 1 month ago

After updating xinference to 0.12.3, the problem still occurs.

romejiang commented 1 month ago

> After updating xinference to 0.12.3, the problem still occurs.

Change these two parameters to false in config.json:

"toolChoice": false,
"functionCall": false,
hellolixy commented 1 month ago

Has this been solved? I deployed the qw2-7b model on xinference and get the same error, but it works fine on dify:

message: '400 status code (no body)',
stack: 'Error: 400 status code (no body)\n' +
  '    at APIError.generate (webpack-internal:///(api)/../../node_modules/.pnpm/openai@4.28.0_encoding@0.1.13/node_modul ror.mjs:57:20)\n' +
  '    at OpenAI.makeStatusError (webpack-internal:///(api)/../../node_modules/.pnpm/openai@4.28.0encoding@0.1.13/node ai/core.mjs:292:65)\n' +
  '    at OpenAI.makeRequest (webpack-internal:///(api)/../../node_modules/.pnpm/openai@4.28.0_encoding@0.1.13/node_modu ore.mjs:335:30)\n' +
  '    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n' +
  '    at async runToolWithToolChoice (webpack-internal:///(api)/../../packages/service/core/workflow/dispatch/agent/run ice.ts:95:24)\n' +

mojin504 commented 6 days ago

Many people have hit this problem; I looked into it recently.

The 400 is caused by an invalid parameter format. Tool calls work fine in single-step debugging on the workflow editing page, because single-step debugging sends stream=false, so xinference and the OpenAI API don't complain. In a direct chat or a whole-app debug run, stream is true; with tools also passed, the API returns 400, because the request in which the model selects a tool does not support stream=true.

@c121914yu The latest FastGPT has this problem too, please confirm. When sending the chat request, if the tools parameter is passed, force stream to false; later, when the tool-call results are sent back to the LLM, set stream to true. Wouldn't that fix it?
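The workaround proposed above can be sketched as a small request pre-processor (a sketch of the idea, not FastGPT's actual code): if the payload carries tools, the stream flag is forced off; otherwise the caller's setting is kept.

```python
def adjust_stream_for_tools(payload: dict) -> dict:
    """If the request carries tools, force a non-streaming call, since the
    tool-selection request reportedly rejects stream=true; otherwise keep
    the caller's stream setting unchanged."""
    if payload.get("tools"):
        return {**payload, "stream": False}
    return payload

# A tool-call request that would otherwise fail with 400:
tool_request = {"model": "glm-4-9b", "stream": True, "messages": [], "tools": [{"type": "function"}]}
safe_request = adjust_stream_for_tools(tool_request)

# A plain chat request is left untouched:
chat_request = {"model": "glm-4-9b", "stream": True, "messages": []}
unchanged = adjust_stream_for_tools(chat_request)
```

The follow-up request that feeds the tool results back to the model carries no `tools` field in this sketch, so it keeps stream=true and the user still sees a streamed final answer.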

c121914yu commented 5 days ago

> When sending the chat request, if the tools parameter is passed, force stream to false; later, when the tool-call results are sent back to the LLM, set stream to true.

Since GPT can handle stream set to true, why not keep it true? This should be fixed in the middle layer instead: make it compatible with stream=true, even if the response comes back in one shot. As far as I saw, xinference already supports true.
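The middle-layer fix suggested here can be sketched as follows (an illustrative sketch, not the actual xinference patch): a backend that can only answer in one shot still satisfies a stream=true client by wrapping the complete response in a single SSE-style chunk, followed by the terminator.

```python
import json

def fake_stream(completion: dict):
    """Emulate stream=true for a backend that only returns a full response:
    emit the whole message as one chat.completion.chunk, then [DONE]."""
    chunk = {
        "id": completion.get("id", "chatcmpl-0"),
        "object": "chat.completion.chunk",
        "choices": [{
            "index": 0,
            # The full message is delivered as a single delta.
            "delta": completion["choices"][0]["message"],
            "finish_reason": completion["choices"][0].get("finish_reason"),
        }],
    }
    yield f"data: {json.dumps(chunk)}\n\n"
    yield "data: [DONE]\n\n"

completion = {"choices": [{"message": {"role": "assistant", "content": "hi"},
                           "finish_reason": "stop"}]}
events = list(fake_stream(completion))
```

A streaming client consumes this exactly like a real stream; it just receives the entire answer in its first event.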

JinCheng666 commented 5 days ago

xinference added support for this in one of the 0.13 releases; I don't remember the exact minor version. Please update to the latest xinference and try again. This issue will be closed.