BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

[Bug]: `model_list` missing entries? #2803

Closed — foragerr closed this 6 months ago

foragerr commented 6 months ago

What happened?

This snippet works:

import litellm

response = litellm.completion(
    model="gemini/gemini-pro",
    messages=[{"role": "user", "content": "write code for saying hi from LiteLLM"}]
)

but the litellm.model_list response does not contain gemini/gemini-pro.

Relevant log output

for litellm==1.34.21, litellm.model_list returns

[
  "gpt-4",
  "gpt-4-turbo-preview",
  "gpt-4-0314",
  "gpt-4-0613",
  "gpt-4-32k",
  "gpt-4-32k-0314",
  "gpt-4-32k-0613",
  "gpt-4-1106-preview",
  "gpt-4-0125-preview",
  "gpt-4-vision-preview",
  "gpt-4-1106-vision-preview",
  "gpt-3.5-turbo",
  "gpt-3.5-turbo-0301",
  "gpt-3.5-turbo-0613",
  "gpt-3.5-turbo-1106",
  "gpt-3.5-turbo-0125",
  "gpt-3.5-turbo-16k",
  "gpt-3.5-turbo-16k-0613",
  "ft:gpt-3.5-turbo",
  "text-embedding-3-large",
  "text-embedding-3-small",
  "text-embedding-ada-002",
  "text-embedding-ada-002-v2",
  "text-moderation-stable",
  "text-moderation-007",
  "text-moderation-latest",
  "256-x-256/dall-e-2",
  "512-x-512/dall-e-2",
  "1024-x-1024/dall-e-2",
  "hd/1024-x-1792/dall-e-3",
  "hd/1792-x-1024/dall-e-3",
  "hd/1024-x-1024/dall-e-3",
  "standard/1024-x-1792/dall-e-3",
  "standard/1792-x-1024/dall-e-3",
  "standard/1024-x-1024/dall-e-3",
  "whisper-1",
  "azure/gpt-3.5-turbo-instruct-0914",
  "azure/gpt-35-turbo-instruct",
  "babbage-002",
  "davinci-002",
  "gpt-3.5-turbo-instruct",
  "gpt-3.5-turbo-instruct-0914",
  "command-nightly",
  "command",
  "command-medium-beta",
  "command-xlarge-beta",
  "command-r",
  "command-light",
  "claude-instant-1",
  "claude-instant-1.2",
  "claude-2",
  "claude-2.1",
  "claude-3-haiku-20240307",
  "claude-3-opus-20240229",
  "claude-3-sonnet-20240229",
  "replicate/llama-2-70b-chat:2796ee9483c3fd7aa2e171d38f4ca12251a30609463dcfd4cd76703f22e96cdf",
  "a16z-infra/llama-2-13b-chat:2a7f981751ec7fdf87b5b91ad4db53683a98082e9ff7bfd12c8cd5ea85980a52",
  "meta/codellama-13b:1c914d844307b0588599b8393480a3ba917b660c7e9dfae681542b5325f228db",
  "replicate/vicuna-13b:6282abe6a492de4145d7bb601023762212f9ddbbe78278bd6771c8b3b2f2a13b",
  "joehoover/instructblip-vicuna13b:c4c54e3c8c97cd50c2d2fec9be3b6065563ccf7d43787fb99f84151b867178fe",
  "daanelson/flan-t5-large:ce962b3f6792a57074a601d3979db5839697add2e4e02696b3ced4c022d4767freplicate/dolly-v2-12b:ef0e1aefc61f8e096ebe4db6b2bacc297daf2ef6899f0f7e001ec445893500e5",
  "replit/replit-code-v1-3b:b84f4c074b807211cd75e3e8b1589b6399052125b4c27106e43d47189e8415ad",
  "openrouter/openai/gpt-3.5-turbo",
  "openrouter/openai/gpt-3.5-turbo-16k",
  "openrouter/openai/gpt-4",
  "openrouter/anthropic/claude-instant-v1",
  "openrouter/anthropic/claude-2",
  "openrouter/google/palm-2-chat-bison",
  "openrouter/google/palm-2-codechat-bison",
  "openrouter/meta-llama/llama-2-13b-chat",
  "openrouter/meta-llama/llama-2-70b-chat",
  "openrouter/meta-llama/codellama-34b-instruct",
  "openrouter/nousresearch/nous-hermes-llama2-13b",
  "openrouter/mancer/weaver",
  "openrouter/gryphe/mythomax-l2-13b",
  "openrouter/jondurbin/airoboros-l2-70b-2.1",
  "openrouter/undi95/remm-slerp-l2-13b",
  "openrouter/pygmalionai/mythalion-13b",
  "openrouter/mistralai/mistral-7b-instruct",
  "openrouter/mistralai/mistral-7b-instruct:free",
  "meta-llama/Llama-2-7b-hf",
  "meta-llama/Llama-2-7b-chat-hf",
  "meta-llama/Llama-2-13b-hf",
  "meta-llama/Llama-2-13b-chat-hf",
  "meta-llama/Llama-2-70b-hf",
  "meta-llama/Llama-2-70b-chat-hf",
  "meta-llama/Llama-2-7b",
  "meta-llama/Llama-2-7b-chat",
  "meta-llama/Llama-2-13b",
  "meta-llama/Llama-2-13b-chat",
  "meta-llama/Llama-2-70b",
  "meta-llama/Llama-2-70b-chat",
  "chat-bison",
  "chat-bison@001",
  "chat-bison@002",
  "chat-bison-32k",
  "text-bison",
  "text-bison@001",
  "text-unicorn",
  "text-unicorn@001",
  "j2-ultra",
  "j2-mid",
  "j2-light",
  "togethercomputer/llama-2-70b-chat",
  "togethercomputer/llama-2-70b",
  "togethercomputer/LLaMA-2-7B-32K",
  "togethercomputer/Llama-2-7B-32K-Instruct",
  "togethercomputer/llama-2-7b",
  "togethercomputer/falcon-40b-instruct",
  "togethercomputer/falcon-7b-instruct",
  "togethercomputer/alpaca-7b",
  "HuggingFaceH4/starchat-alpha",
  "togethercomputer/CodeLlama-34b",
  "togethercomputer/CodeLlama-34b-Instruct",
  "togethercomputer/CodeLlama-34b-Python",
  "defog/sqlcoder",
  "NumbersStation/nsql-llama-2-7B",
  "WizardLM/WizardCoder-15B-V1.0",
  "WizardLM/WizardCoder-Python-34B-V1.0",
  "NousResearch/Nous-Hermes-Llama2-13b",
  "Austism/chronos-hermes-13b",
  "upstage/SOLAR-0-70b-16bit",
  "WizardLM/WizardLM-70B-V1.0",
  "qvv0xeq",
  "q841o8w",
  "31dxrj3",
  "luminous-base",
  "luminous-base-control",
  "luminous-extended",
  "luminous-extended-control",
  "luminous-supreme",
  "luminous-supreme-control",
  "dolphin",
  "chatdolphin",
  "llama2",
  "ai21.j2-mid-v1",
  "ai21.j2-ultra-v1",
  "amazon.titan-text-lite-v1",
  "amazon.titan-text-express-v1",
  "amazon.titan-embed-text-v1",
  "mistral.mistral-7b-instruct-v0:2",
  "mistral.mixtral-8x7b-instruct-v0:1",
  "bedrock/us-west-2/mistral.mixtral-8x7b-instruct-v0:1",
  "bedrock/us-west-2/mistral.mistral-7b-instruct",
  "anthropic.claude-3-sonnet-20240229-v1:0",
  "anthropic.claude-3-haiku-20240307-v1:0",
  "anthropic.claude-v1",
  "bedrock/us-east-1/anthropic.claude-v1",
  "bedrock/us-west-2/anthropic.claude-v1",
  "bedrock/ap-northeast-1/anthropic.claude-v1",
  "bedrock/ap-northeast-1/1-month-commitment/anthropic.claude-v1",
  "bedrock/ap-northeast-1/6-month-commitment/anthropic.claude-v1",
  "bedrock/eu-central-1/anthropic.claude-v1",
  "bedrock/eu-central-1/1-month-commitment/anthropic.claude-v1",
  "bedrock/eu-central-1/6-month-commitment/anthropic.claude-v1",
  "bedrock/us-east-1/1-month-commitment/anthropic.claude-v1",
  "bedrock/us-east-1/6-month-commitment/anthropic.claude-v1",
  "bedrock/us-west-2/1-month-commitment/anthropic.claude-v1",
  "bedrock/us-west-2/6-month-commitment/anthropic.claude-v1",
  "anthropic.claude-v2",
  "bedrock/us-east-1/anthropic.claude-v2",
  "bedrock/us-west-2/anthropic.claude-v2",
  "bedrock/ap-northeast-1/anthropic.claude-v2",
  "bedrock/ap-northeast-1/1-month-commitment/anthropic.claude-v2",
  "bedrock/ap-northeast-1/6-month-commitment/anthropic.claude-v2",
  "bedrock/eu-central-1/anthropic.claude-v2",
  "bedrock/eu-central-1/1-month-commitment/anthropic.claude-v2",
  "bedrock/eu-central-1/6-month-commitment/anthropic.claude-v2",
  "bedrock/us-east-1/1-month-commitment/anthropic.claude-v2",
  "bedrock/us-east-1/6-month-commitment/anthropic.claude-v2",
  "bedrock/us-west-2/1-month-commitment/anthropic.claude-v2",
  "bedrock/us-west-2/6-month-commitment/anthropic.claude-v2",
  "anthropic.claude-v2:1",
  "bedrock/us-east-1/anthropic.claude-v2:1",
  "bedrock/us-west-2/anthropic.claude-v2:1",
  "bedrock/ap-northeast-1/anthropic.claude-v2:1",
  "bedrock/ap-northeast-1/1-month-commitment/anthropic.claude-v2:1",
  "bedrock/ap-northeast-1/6-month-commitment/anthropic.claude-v2:1",
  "bedrock/eu-central-1/anthropic.claude-v2:1",
  "bedrock/eu-central-1/1-month-commitment/anthropic.claude-v2:1",
  "bedrock/eu-central-1/6-month-commitment/anthropic.claude-v2:1",
  "bedrock/us-east-1/1-month-commitment/anthropic.claude-v2:1",
  "bedrock/us-east-1/6-month-commitment/anthropic.claude-v2:1",
  "bedrock/us-west-2/1-month-commitment/anthropic.claude-v2:1",
  "bedrock/us-west-2/6-month-commitment/anthropic.claude-v2:1",
  "anthropic.claude-instant-v1",
  "bedrock/us-east-1/anthropic.claude-instant-v1",
  "bedrock/us-east-1/1-month-commitment/anthropic.claude-instant-v1",
  "bedrock/us-east-1/6-month-commitment/anthropic.claude-instant-v1",
  "bedrock/us-west-2/1-month-commitment/anthropic.claude-instant-v1",
  "bedrock/us-west-2/6-month-commitment/anthropic.claude-instant-v1",
  "bedrock/us-west-2/anthropic.claude-instant-v1",
  "bedrock/ap-northeast-1/anthropic.claude-instant-v1",
  "bedrock/ap-northeast-1/1-month-commitment/anthropic.claude-instant-v1",
  "bedrock/ap-northeast-1/6-month-commitment/anthropic.claude-instant-v1",
  "bedrock/eu-central-1/anthropic.claude-instant-v1",
  "bedrock/eu-central-1/1-month-commitment/anthropic.claude-instant-v1",
  "bedrock/eu-central-1/6-month-commitment/anthropic.claude-instant-v1",
  "cohere.command-text-v14",
  "bedrock/*/1-month-commitment/cohere.command-text-v14",
  "bedrock/*/6-month-commitment/cohere.command-text-v14",
  "cohere.command-light-text-v14",
  "bedrock/*/1-month-commitment/cohere.command-light-text-v14",
  "bedrock/*/6-month-commitment/cohere.command-light-text-v14",
  "cohere.embed-english-v3",
  "cohere.embed-multilingual-v3",
  "meta.llama2-13b-chat-v1",
  "meta.llama2-70b-chat-v1",
  "512-x-512/50-steps/stability.stable-diffusion-xl-v0",
  "512-x-512/max-steps/stability.stable-diffusion-xl-v0",
  "max-x-max/50-steps/stability.stable-diffusion-xl-v0",
  "max-x-max/max-steps/stability.stable-diffusion-xl-v0",
  "1024-x-1024/50-steps/stability.stable-diffusion-xl-v1",
  "1024-x-1024/max-steps/stability.stable-diffusion-xl-v1",
  "deepinfra/lizpreciatior/lzlv_70b_fp16_hf",
  "deepinfra/Gryphe/MythoMax-L2-13b",
  "deepinfra/mistralai/Mistral-7B-Instruct-v0.1",
  "deepinfra/meta-llama/Llama-2-70b-chat-hf",
  "deepinfra/cognitivecomputations/dolphin-2.6-mixtral-8x7b",
  "deepinfra/codellama/CodeLlama-34b-Instruct-hf",
  "deepinfra/deepinfra/mixtral",
  "deepinfra/Phind/Phind-CodeLlama-34B-v2",
  "deepinfra/mistralai/Mixtral-8x7B-Instruct-v0.1",
  "deepinfra/deepinfra/airoboros-70b",
  "deepinfra/01-ai/Yi-34B-Chat",
  "deepinfra/01-ai/Yi-6B-200K",
  "deepinfra/jondurbin/airoboros-l2-70b-gpt4-1.4.1",
  "deepinfra/meta-llama/Llama-2-13b-chat-hf",
  "deepinfra/amazon/MistralLite",
  "deepinfra/meta-llama/Llama-2-7b-chat-hf",
  "deepinfra/01-ai/Yi-34B-200K",
  "deepinfra/openchat/openchat_3.5",
  "perplexity/codellama-34b-instruct",
  "perplexity/codellama-70b-instruct",
  "perplexity/pplx-7b-chat",
  "perplexity/pplx-70b-chat",
  "perplexity/pplx-7b-online",
  "perplexity/pplx-70b-online",
  "perplexity/llama-2-70b-chat",
  "perplexity/mistral-7b-instruct",
  "perplexity/mixtral-8x7b-instruct",
  "perplexity/sonar-small-chat",
  "perplexity/sonar-small-online",
  "perplexity/sonar-medium-chat",
  "perplexity/sonar-medium-online",
  "maritalk"
]


foragerr commented 6 months ago

This list needs an update? https://github.com/BerriAI/litellm/blob/main/litellm/__init__.py#L431

foragerr commented 6 months ago

Also, this entry in the model_list is broken; it should be two separate lines.

"daanelson/flan-t5-large:ce962b3f6792a57074a601d3979db5839697add2e4e02696b3ced4c022d4767freplicate/dolly-v2-12b:ef0e1aefc61f8e096ebe4db6b2bacc297daf2ef6899f0f7e001ec445893500e5",

Missing comma here? https://github.com/BerriAI/litellm/blob/main/litellm/__init__.py#L351

foragerr commented 6 months ago

More generally, https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json has 308 entries, while litellm.model_list returns 246. Are 62 entries missing?

krrishdholakia commented 6 months ago

Are you looking for the model cost map, litellm.model_cost?

https://github.com/BerriAI/litellm/blob/c35b4c9b80d9cd7d61bfa1120e48e30c295cc68c/litellm/__init__.py#L233

https://docs.litellm.ai/docs/completion/token_usage#7-model_cost
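For a sense of what model_cost holds, here is a toy sketch of its shape — the entries and prices below are made up for illustration, not litellm's actual data (the real values ship in model_prices_and_context_window.json):

```python
# Hypothetical miniature of litellm.model_cost's shape; values are invented
# for illustration and do not reflect real litellm pricing data.
model_cost = {
    "gpt-4": {
        "max_tokens": 8192,
        "input_cost_per_token": 3e-05,
        "output_cost_per_token": 6e-05,
    },
    "claude-2": {
        "max_tokens": 100000,
        "input_cost_per_token": 8e-06,
        "output_cost_per_token": 2.4e-05,
    },
}

# Enumerating supported model names from the cost map, as discussed above:
supported = list(model_cost.keys())
print(supported)
```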

krrishdholakia commented 6 months ago

Here's how model_list is initialized; let me know if you see any gaps in the implementation: https://github.com/BerriAI/litellm/blob/c35b4c9b80d9cd7d61bfa1120e48e30c295cc68c/litellm/__init__.py#L431

foragerr commented 6 months ago

I'm looking for a full list of models supported by litellm. I assumed, perhaps mistakenly, that litellm.model_list is supposed to provide that.

I'm happy to just use litellm.model_cost.keys() if you think that is the right direction.

foragerr commented 6 months ago

I raised this PR: https://github.com/BerriAI/litellm/pull/2806 to add Gemini models to model_list and to fix a missing comma elsewhere.

foragerr commented 6 months ago

Hey @krrishdholakia @ishaan-jaff, there are still some differences between litellm.model_cost.keys() and litellm.model_list. Could you comment on which should be treated as the source of truth for models supported by litellm?

krrishdholakia commented 6 months ago

Hey @foragerr, litellm supports providers.

For example, you can call any model on Together AI through litellm.

The model list is just a list of specific popular models we're tracking for cost, etc.

If you want to see which providers are supported by litellm, you can use litellm.provider_list.
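Since litellm routes by provider rather than by an exhaustive model list, the provider prefix in the model string does the routing work. A rough, simplified sketch of that idea (not litellm's actual implementation — its real resolution logic is more involved, see get_llm_provider in the source):

```python
# Simplified illustration of provider-prefix routing. The provider list here
# is a small hypothetical subset; the real one is litellm.provider_list.
PROVIDER_LIST = ["openai", "gemini", "together_ai", "bedrock", "openrouter"]

def split_provider(model: str) -> tuple[str, str]:
    """Split 'provider/model' into (provider, model), defaulting to 'openai'."""
    prefix, _, rest = model.partition("/")
    if rest and prefix in PROVIDER_LIST:
        return prefix, rest
    # No recognized provider prefix: treat the whole string as the model name.
    return "openai", model

print(split_provider("gemini/gemini-pro"))  # ('gemini', 'gemini-pro')
print(split_provider("gpt-4"))              # ('openai', 'gpt-4')
```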

krrishdholakia commented 6 months ago

If you can share specific gaps, I can also investigate those.

foragerr commented 6 months ago

import litellm

models_from_model_cost = litellm.model_cost.keys()
models_from_model_list = litellm.model_list

# print count of models common to both model_cost and model_list
print(f"Common models: {len(set(models_from_model_cost) & set(models_from_model_list))}")

# print count of models unique to model_cost
print(f"Models in model_cost, but not model_list: {len(set(models_from_model_cost) - set(models_from_model_list))}")

# print count of models unique to model_list
print(f"Models in model_list, but not model_cost: {len(set(models_from_model_list) - set(models_from_model_cost))}")

This prints:

Common models: 206
Models in model_cost, but not model_list: 102
Models in model_list, but not model_cost: 45

foragerr commented 6 months ago

Your comment about supporting providers rather than models makes sense.

Over in the OpenDevin repo, the model is set dynamically in the call

response = litellm.completion(
    model=some_model, 
    messages=[{"role": "user", "content": "write code for saying hi from LiteLLM"}]
)

and it would be nice to have a list of allowed values for model. They're leaning towards using list(set(litellm.model_list) | set(litellm.model_cost.keys())).
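That union can be sketched with stand-in lists (the values below are toy data, not the real litellm lists) to show that it merges both sources and deduplicates names present in both:

```python
# Toy stand-ins for litellm.model_list and litellm.model_cost.keys();
# the real values come from the installed litellm package.
model_list = ["gpt-4", "claude-2", "command-r"]
model_cost_keys = ["gpt-4", "gemini/gemini-pro"]

# Union of both sources, deduplicated, as OpenDevin is considering.
# "gpt-4" appears in both inputs but only once in the result.
allowed_models = list(set(model_list) | set(model_cost_keys))
print(sorted(allowed_models))
```

Note that a set union loses ordering, so the result is usually sorted (or otherwise ordered) before being presented as a list of allowed values.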