BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

[Bug]: WebSocket issues with OpenAI Realtime API in the browser #6825

Open mirodrr2 opened 2 days ago

mirodrr2 commented 2 days ago

What happened?

Has anyone managed to get this project working with LiteLLM? https://github.com/openai/openai-realtime-console

It's a React app that calls the OpenAI Realtime voice API directly from the browser via a WebSocket. If you look under the hood, it opens the WebSocket like this:

// Browsers can't attach custom headers to a WebSocket, so the console passes
// the API key and beta flag in as subprotocols.
const WebSocket = globalThis.WebSocket;
const ws = new WebSocket(wsUrl, [
  'realtime',
  `openai-insecure-api-key.${this.apiKey}`,
  'openai-beta.realtime-v1',
]);

The behavior is that the client app can reach the /v1/realtime endpoint, and the server gets as far as running await websocket.accept(), but the connection is immediately closed (seemingly from the browser's end) with a generic 1006 error. I've added extensive logging on both the client and the server, but nothing has gotten me any closer to a solution.
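One detail that may matter here: because the client asks for subprotocols, the browser expects the server's handshake response to select one of them. If the server accepts the upgrade without echoing a subprotocol back (or if a proxy in between strips the Sec-WebSocket-Protocol response header), browsers abort the connection, which shows up as exactly this kind of immediate 1006 close. A minimal illustration of that negotiation, using the Node ws package purely for demonstration (this is not LiteLLM's code, which is Python):

const { WebSocketServer } = require('ws');

const wss = new WebSocketServer({
  port: 8080,
  // When the client offers subprotocols, the server has to pick one of them;
  // otherwise the browser fails the handshake and reports a 1006 close.
  handleProtocols: (protocols) => (protocols.has('realtime') ? 'realtime' : false),
});

wss.on('connection', (ws) => {
  console.log('accepted, negotiated subprotocol:', ws.protocol);
});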

I've gotten this exact code to work against LiteLLM in a JavaScript app outside the browser, but no matter what I do it does not work in the browser. I am running LiteLLM on AWS ECS behind an Application Load Balancer.
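For comparison, a non-browser version of the same handshake can be sketched with the Node ws package (the proxy URL, model name, and key variable below are placeholders, not the exact values used):

const WebSocket = require('ws');

// Placeholder endpoint; substitute the actual LiteLLM proxy URL and model.
const wsUrl = 'ws://localhost:4000/v1/realtime?model=gpt-4o-realtime-preview-2024-10-01';

const ws = new WebSocket(wsUrl, [
  'realtime',
  `openai-insecure-api-key.${process.env.LITELLM_API_KEY}`,
  'openai-beta.realtime-v1',
]);

ws.on('open', () => console.log('open, negotiated subprotocol:', ws.protocol || '(none)'));
ws.on('close', (code, reason) => console.log('closed:', code, reason.toString()));
ws.on('error', (err) => console.error('error:', err.message));

Outside the browser a client like this could also pass the key in an Authorization header via the ws constructor's options argument instead of a subprotocol, and some non-browser clients are more lenient about subprotocol negotiation than browsers are, which may account for the difference in behavior.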

The frontend code works fine if you call the OpenAI API directly.

It's unclear to me whether this is a LiteLLM issue, an Application Load Balancer issue, or an ECS issue. This is the only way I've ever hosted LiteLLM, and it's the first problem I've run into after using the rest of the LiteLLM APIs through this setup for a while.
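One way to narrow down which layer is closing the socket, using the same kind of Node ws client as above (both endpoints below are placeholders): run the handshake once through the ALB's DNS name and once directly against the ECS task, and compare the results.

const WebSocket = require('ws');

// Placeholders; substitute the real ALB DNS name and a direct route to the
// ECS task (e.g. its private IP over a VPN or a port forward).
const endpoints = [
  'wss://litellm-alb.example.com',
  'ws://10.0.1.23:4000',
];

for (const base of endpoints) {
  const ws = new WebSocket(`${base}/v1/realtime?model=gpt-4o-realtime-preview-2024-10-01`, [
    'realtime',
    `openai-insecure-api-key.${process.env.LITELLM_API_KEY}`,
    'openai-beta.realtime-v1',
  ]);
  ws.on('open', () => console.log(base, 'open, subprotocol:', ws.protocol || '(none)'));
  ws.on('close', (code) => console.log(base, 'closed with', code));
  ws.on('error', (err) => console.error(base, 'error:', err.message));
}

If the direct connection negotiates a subprotocol and stays open while the path through the ALB does not, the load balancer is the more likely culprit; if both behave the same way, the proxy itself is the better suspect.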

I am not using the relay server option provided by the repo, as I want to avoid running two proxies.

Relevant log output

No response

Twitter / LinkedIn details

No response