gpustack / gpustack

Manage GPU clusters for running LLMs
https://gpustack.ai
Apache License 2.0
279 stars 19 forks

Feature Request: Add Ollama-compatible APIs #299

Open arnesund opened 1 day ago

arnesund commented 1 day ago

I'm setting up Open WebUI as the web frontend for internal use of LLMs, complete with HA + SSO setup and all. GPUStack is used as an OpenAI-compatible backend and it works great for web-based access.

I'm also exploring ways to make it easy for our devs to get API access to the same wide variety of LLMs. Open WebUI exposes an /ollama API endpoint that passes requests through to Ollama-compatible backends after authenticating them. It doesn't support this for OpenAI-compatible backends. Any chance you'll consider adding Ollama-compatible APIs to GPUStack?

That would make this use case easier to realize, while keeping a single entrypoint (Open WebUI) that handles authentication before passing requests through to GPUStack.
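For context, the two API styles differ less in the chat payload than in endpoint paths and response format. A minimal sketch of the request bodies involved; the model name and prompt are placeholders, not a real deployment:

```python
def ollama_chat_request(model: str, prompt: str) -> dict:
    """Request body for Ollama's native chat endpoint (POST /api/chat)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # Ollama streams by default; disable for a single response
    }


def openai_chat_request(model: str, prompt: str) -> dict:
    """Request body for an OpenAI-style endpoint (POST /v1/chat/completions)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
```

For basic chat the bodies are nearly identical; the divergence is mostly in the paths (/api/chat vs. /v1/chat/completions), the response schema, and auxiliary endpoints such as /api/tags vs. /v1/models, which is what an Ollama-compatibility layer would need to cover.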

gitlawr commented 14 hours ago

OpenAI-compatible APIs are widely adopted in modern inference engines (llama.cpp, vLLM, SGLang, etc.) and local serving tools (LM Studio, LocalAI, etc.). I think it makes more sense if the existing authentication in Open WebUI can work with any OpenAI-compatible backend.
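Since GPUStack already speaks the OpenAI wire format, developers can target it directly with any OpenAI-style client. A minimal sketch that assembles (but does not send) such a request using only the standard library; the base URL and API key are hypothetical placeholders for a GPUStack deployment:

```python
import json
import urllib.request

# Placeholder values; substitute your deployment's URL and API key.
BASE_URL = "http://gpustack.local/v1"
API_KEY = "sk-example"


def build_chat_completion_request(model: str, prompt: str) -> urllib.request.Request:
    """Assemble an OpenAI-style chat completion request with bearer auth."""
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=payload,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The same request shape works against any OpenAI-compatible server, which is the point being made: solving authenticated passthrough once for this format covers GPUStack along with the other backends listed above.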