Closed: sabaimran closed this issue 5 months ago
Hi @sabaimran, I believe I can help with this issue. I'm the maintainer of LiteLLM (https://github.com/BerriAI/litellm) - we let you use any LLM as a drop-in replacement for gpt-3.5-turbo.
You can use LiteLLM in the following ways:
Option 1: this calls the provider API directly, using your own keys.
from litellm import completion
import os
## set ENV variables
os.environ["OPENAI_API_KEY"] = "your-key" #
os.environ["COHERE_API_KEY"] = "your-key" #
messages = [{ "content": "Hello, how are you?","role": "user"}]
# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)
# cohere call
response = completion(model="command-nightly", messages=messages)
Option 2: this is useful if you don't have your own provider keys but want to use the open-source LiteLLM proxy to access models like Claude.
from litellm import completion
import os
## set ENV variables
os.environ["OPENAI_API_KEY"] = "sk-litellm-5b46387675a944d2" # [OPTIONAL] replace with your openai key
os.environ["COHERE_API_KEY"] = "sk-litellm-5b46387675a944d2" # [OPTIONAL] replace with your cohere key
messages = [{ "content": "Hello, how are you?","role": "user"}]
# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)
# cohere call
response = completion(model="command-nightly", messages=messages)
Hey @sabaimran, @ishaan-jaff, any updates on this?
Hey @ishaan-jaff, @krrishdholakia, what does this offer over our current setup? In the example where we'd use our own API key, what would be the benefit of using LiteLLM vs. just calling OpenAI directly?
We're planning to host the proxy server on our own infrastructure and are still evaluating dev tools for hosting options.
We'll prioritize support for the best available open-source models.
We have open-source proxy code that might help - https://github.com/BerriAI/liteLLM-proxy/blob/main/main.py
It seems like that's what you're trying to build - i.e., a server that sits in front of your LLMs (self-deployed + openai/anthropic/etc.) and makes the API calls for you.
Huggingface TGI, Anthropic, and OpenAI all have different input params and output formats.
LiteLLM simplifies that by keeping them all consistent with the OpenAI format.
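As a concrete illustration, here's a minimal sketch of what that consistency buys you: the same call signature and the same response parsing across providers. This assumes the environment variables from the earlier snippets are set; the dict-style access below mirrors the OpenAI response shape, and depending on your LiteLLM version attribute access (response.choices[0].message.content) may also work.
from litellm import completion
messages = [{"content": "Hello, how are you?", "role": "user"}]
# same call and same OpenAI-shaped response for both providers
for model in ["gpt-3.5-turbo", "command-nightly"]:
    response = completion(model=model, messages=messages)
    # OpenAI format: choices -> message -> content
    print(model, response["choices"][0]["message"]["content"])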
@sabaimran What is litellm missing to be useful to you? Any feedback here would be helpful.
Hey guys! Closing the loop here. We're not going to set up our own inference server, but litellm would be the proxy server of choice when we use additional models that can leverage an OpenAI API-compatible interface. We would most likely set this up in a private codebase for ease of use. We've had a good dev experience with the litellm proxy server via Docker. Thanks, @krrishdholakia & @ishaan-jaff!
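For anyone who lands here later, a minimal sketch of what that OpenAI API-compatible usage can look like against a running LiteLLM proxy. The host/port and model name are placeholders for your own deployment, this assumes the openai Python SDK v1+, and whether the api_key is actually checked depends on how the proxy is configured.
from openai import OpenAI
# point the standard OpenAI client at the LiteLLM proxy instead of api.openai.com
# http://localhost:8000 is a placeholder; use the host/port your proxy listens on
client = OpenAI(api_key="anything", base_url="http://localhost:8000")
response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # or any model name your proxy is configured to serve
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)
print(response.choices[0].message.content)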
@sabaimran how could the Docker spin-up process have been easier? Working on improving the quick start flow this week.
Set up an inference server which gives access to several different models. Include access to:
Specification: