BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

[Feature]: Poll ollama for new endpoints #979

Open krrishdholakia opened 10 months ago

krrishdholakia commented 10 months ago

The Feature

ollama exposes a /api/tags endpoint.

Poll it when ollama models are passed in, to check whether new models are available.
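
A rough sketch of what the polling could look like (the base URL and interval here are placeholder assumptions; Ollama's /api/tags returns a JSON object with a "models" array):

import time
import requests

OLLAMA_API_BASE = "http://localhost:11434"  # placeholder; would come from the proxy config
POLL_INTERVAL = 60  # seconds, arbitrary for this sketch

known = set()
while True:
    # /api/tags returns {"models": [{"name": "...", ...}, ...]}
    resp = requests.get(f"{OLLAMA_API_BASE}/api/tags", timeout=10)
    current = {m["name"] for m in resp.json()["models"]}
    added = current - known
    if added:
        print(f"new ollama models: {sorted(added)}")
    known = current
    time.sleep(POLL_INTERVAL)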

Motivation, pitch

Improve users' lives.

https://github.com/Luxadevi/Ollama-Companion

Twitter / LinkedIn details

No response

shuther commented 10 months ago

@krrishdholakia, I am confused about how ollama is expected to work.

krrishdholakia commented 10 months ago

@shuther I believe the separation of concerns is noted in another issue - https://github.com/BerriAI/litellm/issues/969

OLLAMA_API_BASE is for passing in the ollama api base via an environment variable. It's a supported way of setting the api base. Thank you for pointing out that it was missing from our documentation.

(Screenshot: 2023-12-04, 8:51 AM)
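
For example, a minimal sketch (assumes a local Ollama instance with the llama2 model already pulled):

import os
os.environ["OLLAMA_API_BASE"] = "http://localhost:11434"

from litellm import completion

# response follows the OpenAI format
response = completion(
    model="ollama/llama2",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
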
qrkourier commented 9 months ago

A Python solution for building a proxy config from the list of Ollama models may serve as one piece of what's needed for this issue.

import requests
import yaml
import copy

# Fetch the list of available models from Ollama's /api/tags endpoint
response = requests.get('http://ollama:11434/api/tags')
models = [model['name'] for model in response.json()['models']]

# Define the template
template = {
  "model_name": "MODEL",
  "litellm_params": {
    "model": "MODEL",
    "api_base": "http://ollama:11434",
    "stream": False
  }
}

# Build the model_list
model_list = []
for model in models:
    new_item = copy.deepcopy(template)
    new_item['model_name'] = model
    new_item['litellm_params']['model'] = f"ollama/{model}"
    model_list.append(new_item)

litellm_config = {
    "model_list": model_list
}
# Print the resulting proxy config as YAML
print(yaml.dump(litellm_config))
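
If the output is written to a file (e.g. config.yaml, just an example name), it should be loadable by the proxy via litellm --config config.yaml; regenerating and reloading it would be the manual version of what this issue asks to automate.
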
krrishdholakia commented 9 months ago

So we already have background health check functionality: https://docs.litellm.ai/docs/proxy/health#background-health-checks

Perhaps for ollama, we could point it at /api/tags instead, and that should serve as a way to update the model list.
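
As a rough sketch of what that could look like, assuming a background task inside the proxy (the function and setup here are hypothetical, not litellm's existing health check code):

import asyncio
import httpx

OLLAMA_API_BASE = "http://ollama:11434"  # same base as in the config above

async def refresh_ollama_models(model_list, interval=300):
    # Hypothetical background task: periodically re-read /api/tags and
    # append any Ollama models that are not yet in the proxy's model_list.
    async with httpx.AsyncClient() as client:
        while True:
            resp = await client.get(f"{OLLAMA_API_BASE}/api/tags")
            available = {m["name"] for m in resp.json()["models"]}
            known = {
                m["litellm_params"]["model"].removeprefix("ollama/")
                for m in model_list
                if m["litellm_params"]["model"].startswith("ollama/")
            }
            for name in available - known:
                model_list.append({
                    "model_name": name,
                    "litellm_params": {
                        "model": f"ollama/{name}",
                        "api_base": OLLAMA_API_BASE,
                    },
                })
            await asyncio.sleep(interval)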