BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

[Feature]: Poll ollama for new endpoints #979

Open krrishdholakia opened 10 months ago

krrishdholakia commented 10 months ago

The Feature

ollama exposes a /api/tags endpoint.

Poll it when ollama models are passed in, to check whether new models are available.
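
A rough sketch of what the polling could look like (the base URL and interval here are placeholder assumptions; Ollama's /api/tags returns a JSON object with a "models" array):

import time
import requests

OLLAMA_API_BASE = "http://localhost:11434"  # placeholder; would come from the proxy config
POLL_INTERVAL = 60  # seconds, arbitrary for this sketch

known = set()
while True:
    # /api/tags returns {"models": [{"name": "...", ...}, ...]}
    resp = requests.get(f"{OLLAMA_API_BASE}/api/tags", timeout=10)
    current = {m["name"] for m in resp.json()["models"]}
    added = current - known
    if added:
        print(f"new ollama models: {sorted(added)}")
    known = current
    time.sleep(POLL_INTERVAL)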

Motivation, pitch

Improve users' lives.

https://github.com/Luxadevi/Ollama-Companion

Twitter / LinkedIn details

No response

shuther commented 10 months ago

@krrishdholakia, I am confused about how ollama is expected to work.

krrishdholakia commented 10 months ago

@shuther I believe the separation of concerns is noted in another issue - https://github.com/BerriAI/litellm/issues/969

OLLAMA_API_BASE is for passing in the ollama api base via an environment variable. It's a supported way of setting the api base. Thank you for pointing out that it was missing from our documentation.

(Screenshot: 2023-12-04, 8:51 AM)
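
For example, a minimal sketch (assumes a local Ollama instance with the llama2 model already pulled):

import os
os.environ["OLLAMA_API_BASE"] = "http://localhost:11434"

from litellm import completion

# response follows the OpenAI format
response = completion(
    model="ollama/llama2",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
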
qrkourier commented 9 months ago

A Python solution for building a proxy config from the list of Ollama models may serve as one piece of what's needed for this issue.

import requests
import yaml
import copy

# Fetch the list of available models from Ollama's /api/tags endpoint
response = requests.get('http://ollama:11434/api/tags')
models = [model['name'] for model in response.json()['models']]

# Define the template
template = {
  "model_name": "MODEL",
  "litellm_params": {
    "model": "MODEL",
    "api_base": "http://ollama:11434",
    "stream": False
  }
}

# Build the model_list
model_list = []
for model in models:
    new_item = copy.deepcopy(template)
    new_item['model_name'] = model
    new_item['litellm_params']['model'] = f"ollama/{model}"
    model_list.append(new_item)

litellm_config = {
    "model_list": model_list
}
# Print the resulting proxy config as YAML
print(yaml.dump(litellm_config))
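
If the output is written to a file (e.g. config.yaml, just an example name), it should be loadable by the proxy via litellm --config config.yaml; regenerating and reloading it would be the manual version of what this issue asks to automate.
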
krrishdholakia commented 9 months ago

So we already have background health check functionality: https://docs.litellm.ai/docs/proxy/health#background-health-checks

Perhaps for ollama, we could point it at /api/tags instead, and that should serve as a way to update the model list.
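
As a rough sketch of what that could look like, assuming a background task inside the proxy (the function and setup here are hypothetical, not litellm's existing health check code):

import asyncio
import httpx

OLLAMA_API_BASE = "http://ollama:11434"  # same base as in the config above

async def refresh_ollama_models(model_list, interval=300):
    # Hypothetical background task: periodically re-read /api/tags and
    # append any Ollama models that are not yet in the proxy's model_list.
    async with httpx.AsyncClient() as client:
        while True:
            resp = await client.get(f"{OLLAMA_API_BASE}/api/tags")
            available = {m["name"] for m in resp.json()["models"]}
            known = {
                m["litellm_params"]["model"].removeprefix("ollama/")
                for m in model_list
                if m["litellm_params"]["model"].startswith("ollama/")
            }
            for name in available - known:
                model_list.append({
                    "model_name": name,
                    "litellm_params": {
                        "model": f"ollama/{name}",
                        "api_base": OLLAMA_API_BASE,
                    },
                })
            await asyncio.sleep(interval)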