Is your feature request related to a problem? Please describe
I see the latest nightly has `pull` and `list` available, like ollama - awesome. That lets me use the equivalent of `ollama list`/`ollama pull`.
Any chance of triggering what corresponds to an `ollama serve`, and having `/api/generate` and/or `/api/embeddings` work as a proxy, so that users and apps don't need to look up the randomly generated port number of the running service?
Describe the solution you'd like
Have a way to run models from the API or command line, with a stable host/port exposing an OpenAI-style (or similar) genai serving API that relays to the underlying started container.
Describe alternatives you've considered
No response
Additional context
No response