No response in stream mode when connecting a local model's OpenAI-compatible API

songquanpeng / one-api

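An OpenAI API management and redistribution system that supports Azure, Anthropic Claude, Google PaLM 2 & Gemini, Zhipu ChatGLM, Baidu ERNIE Bot (文心一言), iFLYTEK Spark, Alibaba Tongyi Qianwen, 360 Zhinao, and Tencent Hunyuan. It can be used to manage keys for redistribution, exposes a single API for all LLMs, and features an English UI. It ships as a single executable with a prebuilt Docker image, so it can be deployed in one step and works out of the box.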

https://openai.justsong.cn/

MIT License

No response in stream mode when connecting a local model's OpenAI-compatible API #1539

Open chenhb-zte opened 1 week ago

chenhb-zte commented 1 week ago

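Routine checks

[ ] I have confirmed that there is no similar issue
[ ] I have confirmed that I have upgraded to the latest version
[ ] I have read the project README in full, especially the FAQ section
[ ] I understand and am willing to follow up on this issue, help with testing, and provide feedback
[ ] I understand and accept the above, and I understand that the maintainers have limited time, so issues that do not follow the rules may be ignored or closed directly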

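Problem description

I am using FastGPT + one-api + a local model (OpenAI-compatible API). The channel test in one-api passes, but FastGPT reports "对话接口报错或返回为空" (the chat API returned an error or an empty response). The one-api log shows completionTokens=0. Testing the model's OpenAI API directly with curl works, but sending the same request to the one-api endpoint with curl shows the same problem:

curl --location --request POST 'http://xx.xx.xx.xx:3001/v1/chat/completions' --header 'Authorization: Bearer sk-xxxxx' --header 'Content-Type: application/json' --data-raw '{ "model": "llama3-70b", "max_tokens": 2,"stream": true, "temperature": 1,"messages": [ { "role": "user", "content": "hi" } ] }'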
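To make the "direct works, one-api returns nothing" comparison concrete, one option is to count the streamed chunks on each path. The following is a minimal sketch, not part of one-api or curl: `count_sse_chunks` and `sample_stream` are hypothetical helper names, and the sample data only assumes the usual OpenAI-style server-sent event framing (`data: {...}` lines ending with `data: [DONE]`).

```shell
#!/bin/sh
# Hypothetical helper (not part of one-api or curl): reads a server-sent
# event stream on stdin and prints how many "data:" events carry a JSON
# payload. The terminating "data: [DONE]" sentinel is not counted.
count_sse_chunks() {
  awk '/^data: ?[{]/ { n++ } END { print n + 0 }'
}

# Offline sample shaped like an OpenAI-style streaming response, so the
# helper can be tried without a running server.
sample_stream() {
  printf '%s\n\n' \
    'data: {"choices":[{"delta":{"content":"Hi"}}]}' \
    'data: {"choices":[{"delta":{"content":"!"}}]}' \
    'data: [DONE]'
}

sample_stream | count_sse_chunks   # prints 2
```

Against a live endpoint, the same request as above could be run with `curl -sS -N ...` (where `-N` disables curl's output buffering) and piped into `count_sse_chunks`, once against one-api and once against the model directly. A count of 0 through one-api next to a non-zero count from the model would match the symptom described here.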

Steps to reproduce

Expected result

Screenshots (delete this section if there are none)

USTCcgg commented 1 week ago

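Same problem here. Calling the local model directly with curl and stream set to true works and returns content. When the model is connected through a one-api channel, curl against one-api returns content with stream set to false but returns nothing with stream set to true.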

USTCcgg commented 1 week ago

I am using Hugging Face's Text Generation Inference (TGI) framework.

ENg-122 commented 1 week ago

I ran into the same thing. Could it be that nothing in the Hugging Face family is supported?

igophper commented 4 days ago

Could you share your local request and response?

chenhb-zte commented 3 days ago

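The curl test uses the request from the problem description, and no response is printed. When I send the request directly to the model's OpenAI API instead, the streamed output is printed.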

igophper commented 2 days ago

Use curl to call the local model's API directly and see whether the problem occurs there.

igophper commented 1 day ago

Could you post the request and response for your call to the local TGI, along with your one-api configuration?

ludevica commented 1 day ago

curl --location '10.81.1.66:3001/v1/chat/completions' --header 'Content-Type: application/json' --header 'Accept: text/event-stream' --header 'Authorization: Bearer sk-dyjZYJ8xdzcFPp8y5597E57eA5354a808bE82dC4D1982515' --data '{ "model": "qwen2-72b-local", "stream": true, "messages": [ { "role": "user", "content": "1+98等于几" } ] }'

Qwen2-72b is deployed locally on TGI. The screenshot above shows the request going through the one-api endpoint: stream=true does not return properly, while stream=false works.

PS: when the model is accessed directly, without one-api, both stream=true and stream=false work.

igophper commented 1 day ago

So this is curl calling the local model, not one-api. In that case, please paste the response as well, and also how one-api is configured.

ludevica commented 1 day ago

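The 10.81.1.66:3001 address above is the one-api endpoint, and stream=true returns nothing there. The screenshot below shows the result of accessing the TGI-deployed Qwen2 directly: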