Closed: paul-gauthier closed this issue 2 months ago.
hey @paul-gauthier acknowledging this issue.
Will fix + add better testing on our end for this.
FWIW, it appears that 242 of the models in `litellm.model_cost.keys()` are returning sane results, while 95 models are returning the seemingly invalid `{'keys_in_environment': False, 'missing_keys': []}`.
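The contradiction can be stated as a small predicate (`looks_invalid` is an illustrative helper, not part of litellm; `COHERE_API_KEY` is just an example key name): a result that reports the environment as incomplete should also name at least one missing key.

```python
def looks_invalid(res):
    """True for the contradictory shape: the environment is reported as
    incomplete (keys_in_environment is False) yet no missing keys are named."""
    return not res.get("keys_in_environment") and not res.get("missing_keys")

print(looks_invalid({'keys_in_environment': False, 'missing_keys': []}))  # True
print(looks_invalid({'keys_in_environment': False,
                     'missing_keys': ['COHERE_API_KEY']}))                # False
print(looks_invalid({'keys_in_environment': True, 'missing_keys': []}))   # False
```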
I used this script to enumerate them:
```python
import litellm

litellm.suppress_debug_info = True

bad = set()
good = set()
models = litellm.model_cost.keys()
for model in models:
    res = litellm.validate_environment(model)
    missing_keys = res.get("missing_keys")
    keys_in_environment = res.get("keys_in_environment")
    if not keys_in_environment and not missing_keys:
        bad.add(model)
    else:
        good.add(model)

print('num good', len(good))
print()
print('num bad', len(bad))
for model in sorted(bad):
    print(model)
```
And got this output, listing all the bad models:
```
num good 242

num bad 95
anyscale/HuggingFaceH4/zephyr-7b-beta
anyscale/Mixtral-8x7B-Instruct-v0.1
anyscale/codellama/CodeLlama-34b-Instruct-hf
anyscale/meta-llama/Llama-2-13b-chat-hf
anyscale/meta-llama/Llama-2-70b-chat-hf
anyscale/meta-llama/Llama-2-7b-chat-hf
anyscale/mistralai/Mistral-7B-Instruct-v0.1
babbage-002
cloudflare/@cf/meta/llama-2-7b-chat-fp16
cloudflare/@cf/meta/llama-2-7b-chat-int8
cloudflare/@cf/mistral/mistral-7b-instruct-v0.1
cloudflare/@hf/thebloke/codellama-7b-instruct-awq
command-light
command-r
command-r-plus
davinci-002
deepinfra/01-ai/Yi-34B-200K
deepinfra/01-ai/Yi-34B-Chat
deepinfra/01-ai/Yi-6B-200K
deepinfra/Gryphe/MythoMax-L2-13b
deepinfra/Phind/Phind-CodeLlama-34B-v2
deepinfra/amazon/MistralLite
deepinfra/codellama/CodeLlama-34b-Instruct-hf
deepinfra/cognitivecomputations/dolphin-2.6-mixtral-8x7b
deepinfra/deepinfra/airoboros-70b
deepinfra/deepinfra/mixtral
deepinfra/jondurbin/airoboros-l2-70b-gpt4-1.4.1
deepinfra/lizpreciatior/lzlv_70b_fp16_hf
deepinfra/meta-llama/Llama-2-13b-chat-hf
deepinfra/meta-llama/Llama-2-70b-chat-hf
deepinfra/meta-llama/Llama-2-7b-chat-hf
deepinfra/mistralai/Mistral-7B-Instruct-v0.1
deepinfra/mistralai/Mixtral-8x7B-Instruct-v0.1
deepinfra/openchat/openchat_3.5
gemini-1.0-pro-vision
gemini-1.0-pro-vision-001
gemini-pro-vision
gemini/gemini-1.5-pro
gemini/gemini-1.5-pro-latest
gemini/gemini-pro
gemini/gemini-pro-vision
gpt-3.5-turbo-instruct
gpt-3.5-turbo-instruct-0914
groq/gemma-7b-it
groq/llama2-70b-4096
groq/llama3-70b-8192
groq/llama3-8b-8192
groq/mixtral-8x7b-32768
mistral/mistral-embed
mistral/mistral-large-2402
mistral/mistral-large-latest
mistral/mistral-medium
mistral/mistral-medium-2312
mistral/mistral-medium-latest
mistral/mistral-small
mistral/mistral-small-latest
mistral/mistral-tiny
mistral/open-mixtral-8x7b
palm/chat-bison
palm/chat-bison-001
palm/text-bison
palm/text-bison-001
palm/text-bison-safety-off
palm/text-bison-safety-recitation-off
perplexity/codellama-34b-instruct
perplexity/codellama-70b-instruct
perplexity/llama-2-70b-chat
perplexity/mistral-7b-instruct
perplexity/mixtral-8x7b-instruct
perplexity/pplx-70b-chat
perplexity/pplx-70b-online
perplexity/pplx-7b-chat
perplexity/pplx-7b-online
perplexity/sonar-medium-chat
perplexity/sonar-medium-online
perplexity/sonar-small-chat
perplexity/sonar-small-online
sagemaker/meta-textgeneration-llama-2-13b
sagemaker/meta-textgeneration-llama-2-13b-f
sagemaker/meta-textgeneration-llama-2-70b
sagemaker/meta-textgeneration-llama-2-70b-b-f
sagemaker/meta-textgeneration-llama-2-7b
sagemaker/meta-textgeneration-llama-2-7b-f
together-ai-20.1b-40b
together-ai-3.1b-7b
together-ai-40.1b-70b
together-ai-7.1b-20b
together-ai-up-to-3b
voyage/voyage-01
voyage/voyage-2
voyage/voyage-code-2
voyage/voyage-large-2
voyage/voyage-law-2
voyage/voyage-lite-01
voyage/voyage-lite-02-instruct
```
Any update here? Having the library validate and list the needed env variables is a pretty core feature. It really helps users as they try to connect to different providers.
hey @paul-gauthier missed this - my bad!
Let me work on this today.
Tracking missing providers here:
- Providers
- Models
I installed v1.35.35.dev1 which has these changes, and things seem better! But Cohere is still not returning keys.
```
model: command-r-plus
validate_environment: {'keys_in_environment': False, 'missing_keys': []}
```
My enumeration script uncovers a few other models which are in `model_cost` but for which `validate_environment()` returns the seemingly invalid `{'keys_in_environment': False, 'missing_keys': []}`. You can see Cohere's models are in this list too.
```
num bad 12
babbage-002
command-light
command-r
command-r-plus
davinci-002
gpt-3.5-turbo-instruct
gpt-3.5-turbo-instruct-0914
together-ai-20.1b-40b
together-ai-3.1b-7b
together-ai-40.1b-70b
together-ai-7.1b-20b
together-ai-up-to-3b
```
Would it make sense to add the enumeration script as a test? Possibly with a block list if some entries of `model_cost` aren't legit. Or should they be pruned from `model_cost` in that case?
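A sketch of how such a test could be factored (the function name, the `allow_list` parameter, and the factoring are assumptions, not litellm's actual test code); separating the check from the provider calls lets the logic be exercised with a fake validator:

```python
def find_invalid_models(validate, models, allow_list=()):
    """Return the models for which validate(model) yields the contradictory
    {'keys_in_environment': False, 'missing_keys': []} shape."""
    bad = set()
    for model in models:
        res = validate(model)
        if not res.get("keys_in_environment") and not res.get("missing_keys"):
            bad.add(model)
    return bad - set(allow_list)

# In the real test this would be called with litellm.validate_environment
# and litellm.model_cost.keys(); a fake validator shows the intent:
fake_results = {
    "good-model": {"keys_in_environment": True, "missing_keys": []},
    "command-r-plus": {"keys_in_environment": False, "missing_keys": []},
    "together-ai-up-to-3b": {"keys_in_environment": False, "missing_keys": []},
}
bad = find_invalid_models(fake_results.get, fake_results,
                          allow_list={"together-ai-up-to-3b"})
print(bad)  # {'command-r-plus'}
```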
Yup - I'm planning on adding the script you shared as the unit test. Fixing a stability issue right now.
Thanks for testing this! @paul-gauthier
What happened?

`validate_environment()` works fine for `gpt-3.5-turbo` and `claude-3-sonnet-20240229`: `keys_in_environment = True` and `missing_keys = []` as expected when the needed API key is set, and `keys_in_environment = False` with `missing_keys` containing the list of env variables that need to be set when it is not.

But for models like `gemini/gemini-1.5-pro-latest` or `command-r-plus` it always returns `keys_in_environment = False` and `missing_keys = []`, regardless of whether the needed API key variable is set in the environment or not. It always claims the keys aren't set, and it never returns the missing keys that need to be set.

Below is a minimal script to show this. For each of the 4 models, it calls `validate_environment()` and then `completion()`. First it does this with a properly set-up environment, and as you can see all the `completion()` calls succeed. Then it erases the environment and does this again for all 4 models. This time you can see all the `completion()` calls fail because there are no keys in the environment. But as noted above, `validate_environment()` never works correctly for `gemini/gemini-1.5-pro-latest` or `command-r-plus`.
Here is the script:
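A sketch of that repro, reconstructed from the description above (the exact original script isn't shown here; the prompt and error handling are assumptions, and actually running the checks requires litellm plus valid provider keys):

```python
import os

# The four models discussed above.
MODELS = [
    "gpt-3.5-turbo",
    "claude-3-sonnet-20240229",
    "gemini/gemini-1.5-pro-latest",
    "command-r-plus",
]

def check_all():
    """For each model, print validate_environment() and try a completion()."""
    import litellm  # deferred so the sketch imports cleanly without litellm
    for model in MODELS:
        print("model:", model)
        print("validate_environment:", litellm.validate_environment(model))
        try:
            litellm.completion(
                model=model,
                messages=[{"role": "user", "content": "hi"}],
            )
            print("completion: succeeded")
        except Exception as err:
            print("completion: failed:", err)

# First pass with the keys set, second pass with the environment erased:
#   check_all()
#   os.environ.clear()
#   check_all()
```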
And here is the output: