dstackai / dstack

dstack is a lightweight, open-source alternative to Kubernetes & Slurm, simplifying AI container orchestration with multi-cloud & on-prem support. It natively supports NVIDIA, AMD, & TPU.
https://dstack.ai/docs
Mozilla Public License 2.0

[Bug]: Gateway error when streaming OpenAI responses #1563

Closed jvstme closed 1 day ago

jvstme commented 3 months ago

Steps to reproduce

Run a model with an OpenAI-compatible interface.

> cat .dstack.yml
type: service

image: ollama/ollama
commands:
  - ollama serve &
  - sleep 3
  - ollama pull llama3.1
  - fg
port: 11434
spot_policy: auto

resources:
  gpu: 24GB

model:
  format: openai
  type: chat
  name: llama3.1

> dstack apply

Request the model through the gateway.* domain with "stream": true in the request body.

> curl https://gateway.mygateway.example/chat/completions \
    -H 'Authorization: Bearer *****' \
    -H 'Content-Type: application/json' \
    -d '{"model":"llama3.1", "messages": [{"role":"user", "content":"Hi"}], "stream": true}'

Actual behaviour

The response itself does not seem to be affected, but the gateway logs contain an unhandled exception.

Aug 15 18:05:21 ip-172-31-20-96 sh[917]: INFO:     127.0.0.1:39066 - "POST /api/openai/main/chat/completions HTTP/1.0" 200 OK
Aug 15 18:05:22 ip-172-31-20-96 sh[917]: ERROR:    Exception in ASGI application
Aug 15 18:05:22 ip-172-31-20-96 sh[917]: Traceback (most recent call last):
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/starlette/responses.py", line 265, in __call__
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:     await wrap(partial(self.listen_for_disconnect, receive))
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/starlette/responses.py", line 261, in wrap
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:     await func()
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/starlette/responses.py", line 238, in listen_for_disconnect
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:     message = await receive()
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 534, in receive
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:     await self.message_event.wait()
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   File "/usr/lib/python3.10/asyncio/locks.py", line 214, in wait
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:     await fut
Aug 15 18:05:22 ip-172-31-20-96 sh[917]: asyncio.exceptions.CancelledError: Cancelled by cancel scope 75f0941d20b0
Aug 15 18:05:22 ip-172-31-20-96 sh[917]: During handling of the above exception, another exception occurred:
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   + Exception Group Traceback (most recent call last):
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 406, in run_asgi
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |     result = await app(  # type: ignore[func-returns-value]
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |     return await self.app(scope, receive, send)
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in __call__
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |     await super().__call__(scope, receive, send)
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/starlette/applications.py", line 123, in __call__
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |     await self.middleware_stack(scope, receive, send)
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/starlette/middleware/errors.py", line 186, in __call__
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |     raise exc
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/starlette/middleware/errors.py", line 164, in __call__
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |     await self.app(scope, receive, _send)
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/starlette/middleware/cors.py", line 85, in __call__
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |     await self.app(scope, receive, send)
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |     raise exc
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |     await app(scope, receive, sender)
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/starlette/routing.py", line 756, in __call__
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |     await self.middleware_stack(scope, receive, send)
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/starlette/routing.py", line 776, in app
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |     await route.handle(scope, receive, send)
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/starlette/routing.py", line 297, in handle
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |     await self.app(scope, receive, send)
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/starlette/routing.py", line 77, in app
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |     await wrap_app_handling_exceptions(app, request)(scope, receive, send)
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |     raise exc
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |     await app(scope, receive, sender)
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/starlette/routing.py", line 75, in app
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |     await response(scope, receive, send)
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/starlette/responses.py", line 258, in __call__
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |     async with anyio.create_task_group() as task_group:
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 680, in __aexit__
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   |     raise BaseExceptionGroup(
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   | exceptiongroup.ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:   +-+---------------- 1 ----------------
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:     | Traceback (most recent call last):
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:     |   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/starlette/responses.py", line 261, in wrap
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:     |     await func()
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:     |   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/starlette/responses.py", line 250, in stream_response
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:     |     async for chunk in self.body_iterator:
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:     |   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/dstack/gateway/openai/routes.py", line 51, in stream_chunks
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:     |     async for chunk in chunks:
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:     |   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/dstack/gateway/openai/clients/openai.py", line 30, in stream
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:     |     async for data in self.client.stream_sse(
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:     |   File "/home/ubuntu/dstack/blue/lib/python3.10/site-packages/dstack/gateway/common.py", line 29, in stream_sse
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:     |     yield json.loads(line[len("data:") :].strip("\n"))
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:     |   File "/usr/lib/python3.10/json/__init__.py", line 346, in loads
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:     |     return _default_decoder.decode(s)
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:     |   File "/usr/lib/python3.10/json/decoder.py", line 337, in decode
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:     |     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:     |   File "/usr/lib/python3.10/json/decoder.py", line 355, in raw_decode
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:     |     raise JSONDecodeError("Expecting value", s, err.value) from None
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:     | json.decoder.JSONDecodeError: Expecting value: line 1 column 3 (char 2)
Aug 15 18:05:22 ip-172-31-20-96 sh[917]:     +------------------------------------
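
A possible explanation, not confirmed in this thread: OpenAI-style streams usually end with a `data: [DONE]` line, which is not JSON. Applying the expression from the traceback (`line[len("data:") :].strip("\n")`) to such a line leaves ` [DONE]`, and `json.loads` on that string fails with exactly the message logged above. The sketch below reproduces the message and shows one tolerant way to parse the lines. The names `parse_sse_lines` and `reproduce_error` are hypothetical and are not part of dstack; that the upstream server sends `[DONE]` is an assumption.

```python
import json
from typing import Any, Iterable, Iterator

SSE_DATA_PREFIX = "data:"
# Assumption: the upstream server terminates the stream with this sentinel,
# as OpenAI-style streaming APIs commonly do.
DONE_SENTINEL = "[DONE]"


def parse_sse_lines(lines: Iterable[str]) -> Iterator[Any]:
    """Yield decoded JSON payloads from SSE lines, stopping at the sentinel."""
    for line in lines:
        if not line.startswith(SSE_DATA_PREFIX):
            continue  # skip blank separators and non-data fields
        payload = line[len(SSE_DATA_PREFIX):].strip()
        if not payload:
            continue
        if payload == DONE_SENTINEL:
            break  # end of stream, nothing to decode
        yield json.loads(payload)


def reproduce_error() -> str:
    """Apply the slicing from the traceback to a sentinel line and return the error text."""
    line = "data: [DONE]\n"
    try:
        json.loads(line[len("data:"):].strip("\n"))
    except json.JSONDecodeError as exc:
        return str(exc)
    return ""


if __name__ == "__main__":
    print(reproduce_error())
    sample = ['data: {"choices": []}\n', "\n", "data: [DONE]\n"]
    print(list(parse_sse_lines(sample)))
```

If this is the cause, it would also explain why the client-side response looks fine: every real chunk has already been forwarded by the time the sentinel is reached.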

Expected behaviour

No unhandled exceptions.

dstack version

master

Server logs

No response

Additional information

No response

github-actions[bot] commented 2 months ago

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] commented 1 day ago

This issue was closed because it has been inactive for 14 days since being marked as stale. Please reopen the issue if it is still relevant.