KenyonY / openai-forward

🚀 An efficient forwarding service designed for LLMs · OpenAI API Reverse Proxy
https://api.openai-forward.com
MIT License

Using gpt-4 without streaming, submitting long text returns a 504 Gateway Time-out; setting both the httpx timeout and the forwarding timeout to 120 doesn't help #119

Closed XianYue0125 closed 6 months ago

XianYue0125 commented 8 months ago

Initial checks

Problem description

The prompt is roughly 2,700 tokens; I'm using the langchain package:

```python
from langchain_openai import ChatOpenAI
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

chat_model = ChatOpenAI(
    model='gpt-4-1106-preview',
    openai_api_key='',
    openai_api_base='',
    streaming=True,
    # callbacks=[StreamingStdOutCallbackHandler()],
    # temperature=0.2
)
```

(screenshot) The timeout set in the code.

(screenshot) The timeout set on the forwarding server.

Configuration/code example and output

504 Gateway Time-out
nginx/1.18.0 (Ubuntu)

My take

I don't know where the problem is. Searching the project turned up no nginx code; the README mentions proxy_buffering off; but I couldn't find anywhere to configure that in .env, and searching the project's code found nothing either.

Environment

python: 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)] OS: Windows

Final step

XianYue0125 commented 8 months ago

Another problem: when using streaming responses, an error occurs at unpredictable times: httpx.RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read)

(screenshot) I also modified the nginx config, but it still doesn't work.

I haven't been able to track the problem down. I don't know the networking side well; any pointers would be much appreciated, thanks.

KenyonY commented 8 months ago

Is this from openai-forward's runtime error log?

XianYue0125 commented 8 months ago

Under the log folder, the logs in the four folders inside openai are all empty; everything in .env is set to true, but there are still no logs.

XianYue0125 commented 8 months ago

(screenshot) My guess is that because I launched with nohup, the logs all went into nohup.out. Here's a screenshot; everything else is normal.

KenyonY commented 8 months ago
  1. If you don't proxy port 8000 through nginx, does it work normally?
  2. Does it time out only with gpt-4 + non-streaming? That is, does everything else work fine?
  3. I suggest testing the v1/chat/completions endpoint directly with requests (a minimal sketch follows this list).
  4. Does https://api.openai-forward.com/v1/chat/completions show similar problems?
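For step 3, a minimal requests test might look like the following (the base URL, key, and prompt are placeholders for the deployment under test):

```python
import requests

# Point this at the openai-forward deployment being tested (placeholders).
BASE_URL = "http://xxx.xxx.xxx.xxx:8000"
API_KEY = "sk-xxx"

resp = requests.post(
    f"{BASE_URL}/v1/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "gpt-4-1106-preview",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello"},
        ],
        "stream": False,  # reproduce the non-streaming 504 case
    },
    timeout=120,  # same 120s budget as the httpx/forwarding timeouts
)
print(resp.status_code)
print(resp.json())
```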
XianYue0125 commented 8 months ago

It doesn't work even without nginx proxying port 8000. With gpt-4 + streaming it frequently times out; the traffic is roughly 3,000 tokens sent and about 1,500 tokens returned. It's not a langchain problem: going through the forwarder with OpenAI's official client shows the same issue.

With gpt-4 + streaming, the error is usually peer closed connection without sending complete message body; with gpt-4 + non-streaming, it returns 504 Gateway Time-out, nginx/1.18.0 (Ubuntu).

Also, running the same GPT code against a different PHP-based forwarding service shows no problems.

XianYue0125 commented 8 months ago

I also tried accessing via IP address plus port, i.e. http://xxx.xxx.xxx.xxx:8000/v1; the problem persists.

XianYue0125 commented 8 months ago

The openai API:

```python
from openai import OpenAI

client = OpenAI(api_key='', base_url='')
completion = client.chat.completions.create(
    model="gpt-4-1106-preview",
    # model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt},
    ],
    stream=True,
)
```

The langchain API:

```python
from langchain_openai import ChatOpenAI
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

chat_model = ChatOpenAI(
    model='gpt-4-1106-preview',
    # model="gpt-3.5-turbo",
    openai_api_key='',
    openai_api_base='',
    streaming=True,
    callbacks=[StreamingStdOutCallbackHandler()],
    # temperature=0.2
)
```

KenyonY commented 8 months ago

Could you test setting base_url to https://api.openai-forward.com or https://render.openai-forward.com/ and see whether similar problems occur?

XianYue0125 commented 8 months ago

I tried going through a local VPN without setting a base_url, hitting the official endpoint directly, and there was no problem.

XianYue0125 commented 8 months ago

I don't have an official key anymore, so I can't test that.

My current request flow first goes through the cloud server, i.e. the deployed openai-forward port, then forwards locally to another port; the key is mapped to an Azure key, and that second port converts the OpenAI request into an Azure OpenAI request and sends it straight to Azure.

All the errors occur on the openai-forward side; it fails before any data is forwarded to the other port. The GitHub project that converts to Azure is azure-openai-proxy, a prepackaged container.

XianYue0125 commented 8 months ago

(screenshot) Sometimes the response fails halfway through.

XianYue0125 commented 8 months ago

(screenshot) Found one issue: the old .env file uses LOG_OPENAI while the new one uses OPENAI_LOG, so setting the log flag to true had no effect. After fixing that, logs are printed, and the error above shows up occasionally.

XianYue0125 commented 8 months ago
```
File "/var/www/openai-forward/myenv/lib/python3.10/site-packages/starlette/responses.py", line 257, in wrap
    await func()
File "/var/www/openai-forward/myenv/lib/python3.10/site-packages/starlette/responses.py", line 234, in listen_for_disconnect
    message = await receive()
File "/var/www/openai-forward/myenv/lib/python3.10/site-packages/starlette/middleware/base.py", line 52, in wrapped_receive
    msg = await self.receive()
File "/var/www/openai-forward/myenv/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 534, in receive
    await self.message_event.wait()
File "/usr/lib/python3.10/asyncio/locks.py", line 214, in wait
    await fut

asyncio.exceptions.CancelledError: Cancelled by cancel scope 7fdcf03a2b00

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/www/openai-forward/myenv/bin/aifd", line 8, in <module>
    sys.exit(main())
  ... (fire / uvicorn / asyncio event-loop / fastapi / starlette middleware frames omitted)
  File "/var/www/openai-forward/myenv/lib/python3.10/site-packages/starlette/responses.py", line 246, in stream_response
    async for chunk in self.body_iterator:
  File "/var/www/openai-forward/myenv/lib/python3.10/site-packages/openai_forward/decorators.py", line 157, in wrapper
    async for value in async_gen:
  File "/var/www/openai-forward/myenv/lib/python3.10/site-packages/openai_forward/forward/core.py", line 422, in aiter_bytes
    cache_response(cache_key, target_info, route_path, chunk_list)
  File "/var/www/openai-forward/myenv/lib/python3.10/site-packages/openai_forward/cache/__init__.py", line 58, in cache_response
    cache_generic_response(cache_key, route_path, chunk_list)
  File "/var/www/openai-forward/myenv/lib/python3.10/site-packages/openai_forward/cache/__init__.py", line 62, in cache_generic_response
    if cache_key and route_path in CACHE_ROUTE_SET:
TypeError: unhashable type: 'list'
```

KenyonY commented 8 months ago

I suggest first upgrading openai-forward to the latest version:

pip install git+https://github.com/KenyonY/openai-forward

Then configure it correctly according to .env; you can comment out the cache-related settings first and test whether the problem persists. PS: I haven't run into these issues on my own deployment.

XianYue0125 commented 8 months ago

(screenshot) With 0.7.2 the service fails to start with this error; the .env file is already the new one.

XianYue0125 commented 8 months ago

0.7.1 gives the same error as above; 0.7.0 works.

XianYue0125 commented 8 months ago

(screenshot) This is the current .env configuration.

KenyonY commented 8 months ago

I've fixed the .env configuration example in the repo. The correct way to use OPENAI_API_KEY is:

OPENAI_API_KEY={"sk-xxx": [0], "sk-xxx": [1], "sk-xxx": [1,2]}

Thanks for the feedback.

XianYue0125 commented 8 months ago

After configuring the parameters in the new .env format, both 0.7.1 and 0.7.2 run normally.

In this round of testing, openai-forward itself produced no errors, but the client still hits peer closed connection without sending complete message body (incomplete chunked read). I want to look further into nginx and what Microsoft returns; I'll report back as soon as there's progress.

XianYue0125 commented 8 months ago

Accessing directly by IP, the problem persists, so it's not nginx. I still don't know where the problem lies; perhaps the data returned to openai-forward is malformed. Where should I look to inspect what comes back? The chat log only shows what was sent, not what was returned.

KenyonY commented 8 months ago

> Accessing directly by IP, the problem persists, so it's not nginx. I still don't know where the problem lies; perhaps the data returned to openai-forward is malformed. Where should I look to inspect what comes back? The chat log only shows what was sent, not what was returned.

Then you'll need to modify the source: you can add a print of each returned chunk on the line above https://github.com/KenyonY/openai-forward/blob/8322a46bef74ae9465efcb7e08e561c60be6cf6b/openai_forward/forward/core.py#L529
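For illustration, that debug print could look roughly like the following (a paraphrase, not the actual code at that line; it assumes the forwarding generator iterates an httpx streaming response):

```python
import httpx

async def aiter_bytes_debug(r: httpx.Response):
    # Paraphrased forwarding loop: log each raw upstream chunk before
    # yielding it downstream, so a truncated final chunk becomes visible.
    async for chunk in r.aiter_bytes():
        print(repr(chunk))  # the suggested debug print
        yield chunk
```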

XianYue0125 commented 8 months ago

OK, thanks. I'll dig into it carefully on Monday and report back as soon as I have news.

XianYue0125 commented 8 months ago

Hi, I ran the source and printed the returned content, and found the problem. (screenshot) Incomplete data is returned at this point, and then the code errors out with peer closed connection without sending complete message body.

XianYue0125 commented 8 months ago

From what I found online, when a stream returns an incomplete message, you may need to buffer the last partial line, wait for the next chunk to arrive, concatenate them, and only then pass on the complete content.

I hit peer closed connection without sending complete message body with both the langchain and openai clients, so I suspect these packages expect each returned message to be complete; when openai-forward relays an incomplete JSON payload, the client code raises the error.
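For illustration, a minimal sketch of that buffering idea (a hypothetical helper, not openai-forward's implementation): hold back a trailing partial line and prepend it to the next chunk, emitting only complete lines.

```python
async def rebuffer_lines(chunks):
    """Re-chunk an async byte stream on newline boundaries.

    `chunks` is any async iterator of bytes (e.g. an upstream SSE
    response). Complete lines are yielded immediately; a trailing
    partial line is held back until the next chunk completes it.
    """
    buffer = b""
    async for chunk in chunks:
        buffer += chunk
        # The last element of split() is either b"" (the chunk ended on
        # a boundary) or an incomplete line to carry over.
        *complete, buffer = buffer.split(b"\n")
        for line in complete:
            yield line + b"\n"
    if buffer:  # flush whatever remains when the stream ends
        yield buffer
```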

XianYue0125 commented 8 months ago

(screenshot) This looks similar but not quite the same: the JSON content itself is malformed, so the program errors as soon as it receives it.

XianYue0125 commented 8 months ago

(screenshot) It also errored after these messages were returned.

The blue text shows the program erroring on the incomplete message and then automatically resending the request.

XianYue0125 commented 8 months ago

(screenshot) After a run finishes, it consistently reports this error; what could be the cause?

KenyonY commented 8 months ago
  1. In principle, openai-forward returns each chunk it receives without any modification, so if what it has received by the end is an incomplete response body, my understanding is that the upstream response is at fault. But you said other forwarders have no such problem, which is odd.
  2. The IndexError in the log is likewise caused by the incomplete response.
  3. openai-forward only retries when sending fails; the retry your program performs after receiving an incomplete message should be the automatic retry built into OpenAI() or Langchain.
XianYue0125 commented 8 months ago

The problem must indeed be that the upstream returns incomplete messages. Other forwarders work because they reassemble the JSON into complete messages before returning; if the data is relayed untouched, both the openai and langchain clients will error on the incomplete content.

KenyonY commented 8 months ago

@XianYue0125 I just submitted PR #121, which internally distinguishes stream from non-stream requests. Try deploying from that branch and see whether it solves the problem.

github-actions[bot] commented 7 months ago

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] commented 6 months ago

This issue was closed because it has been inactive for 7 days since being marked as stale.

XianYue0125 commented 6 months ago

Sorry, I've been busy with other things and forgot to follow up on this branch.

My testing led to a conclusion: when streaming, because we use Azure's service, once the per-minute token limit is reached, Azure cuts off the streaming response on its side, so the receiver never gets the complete message and errors with peer closed connection without sending complete message body.

It's definitely not a forwarding problem: other forwarding services behave the same way, and even going directly against OpenAI's official servers and API produces the same error when the limit is hit.

Some authors handle this error specially on the receiving side. I wonder whether it could be handled in the forwarding server instead: add handling for the upstream closing the connection, and notify the client that it has exceeded its request quota, as sketched below.
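For illustration, one hedged sketch of such server-side handling (a hypothetical helper, not part of openai-forward): catch the truncated upstream stream and append a synthetic SSE error event so the client sees a diagnosis instead of a bare protocol error.

```python
import httpx

async def relay_with_truncation_notice(response: httpx.Response):
    """Relay upstream SSE chunks; if the upstream drops the connection
    mid-stream (e.g. Azure cutting off at the TPM limit), append a
    synthetic error event instead of propagating the protocol error."""
    try:
        async for chunk in response.aiter_bytes():
            yield chunk
    except httpx.RemoteProtocolError:
        # Hypothetical notice: tell the client the stream was cut off
        # upstream, most likely due to rate limiting.
        yield (b'data: {"error": {"message": '
               b'"upstream closed the stream early, possibly rate limited"}}\n\n')
        yield b"data: [DONE]\n\n"
```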