arnesund opened 1 day ago
OpenAI-compatible APIs are widely adopted in modern inference engines (llama.cpp, vLLM, SGLang, etc.) and local serving tools (LM Studio, LocalAI, etc.). I think it makes more sense if the existing authentication in Open WebUI could work with any OpenAI-compatible backend.
I'm setting up Open WebUI as the web frontend for internal use of LLMs, complete with an HA + SSO setup. GPUStack serves as the OpenAI-compatible backend, and it works great for web-based access.
I'm also exploring ways to make it easy for our devs to get API access to the same wide variety of LLMs. Open WebUI exposes an /ollama API endpoint that passes requests through to Ollama-compatible backends after authenticating them, but it doesn't support this for OpenAI-compatible backends. Any chance you'll consider adding Ollama-compatible APIs to GPUStack?
That would make this use case easier to realize: Open WebUI would remain the single entry point where authentication is handled before requests are passed through to GPUStack.
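To make the intended flow concrete, here is a minimal sketch of how a dev-facing client would authenticate against Open WebUI and let it pass the request through. The base URL, API key, and the OpenAI-style passthrough path are assumptions for illustration (the OpenAI-compatible path does not exist today, which is the point of this issue); the Ollama path follows the /ollama passthrough described above.

```python
import json
import urllib.request

# Hypothetical values for illustration; substitute your deployment's
# base URL and an API key issued by Open WebUI.
OPEN_WEBUI_URL = "https://chat.example.com"
API_KEY = "sk-..."

def build_passthrough_request(path: str, payload: dict) -> urllib.request.Request:
    """Build an authenticated request to an Open WebUI passthrough path.

    Open WebUI validates the Bearer token, then forwards the request
    body to the configured backend.
    """
    return urllib.request.Request(
        url=f"{OPEN_WEBUI_URL}{path}",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Works today: the Ollama-style passthrough mentioned above.
ollama_req = build_passthrough_request(
    "/ollama/api/chat",
    {"model": "llama3", "messages": [{"role": "user", "content": "Hi"}]},
)

# The missing piece this issue asks about: an equivalent passthrough for
# OpenAI-compatible backends such as GPUStack (path is hypothetical).
openai_req = build_passthrough_request(
    "/openai/v1/chat/completions",
    {"model": "llama3", "messages": [{"role": "user", "content": "Hi"}]},
)
```

From the client's perspective, the only difference between the two backends would be the path; authentication stays in one place.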