bigcode-project / bigcode-evaluation-harness

A framework for the evaluation of autoregressive code generation language models.
Apache License 2.0
702 stars, 180 forks

Add support for Ollama, Palm, Claude-2, Cohere, Replicate, Llama2, CodeLlama (100+ LLMs) [LiteLLM] #160

Open ishaan-jaff opened 8 months ago

ishaan-jaff commented 8 months ago

This PR adds support for the above-mentioned LLMs using LiteLLM (https://github.com/BerriAI/litellm/). LiteLLM is a lightweight package that simplifies LLM API calls: you can use any LLM as a drop-in replacement for gpt-3.5-turbo.

Example

import os

from litellm import completion

# set ENV variables
os.environ["OPENAI_API_KEY"] = "openai key"
os.environ["COHERE_API_KEY"] = "cohere key"

messages = [{ "content": "Hello, how are you?","role": "user"}]

# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)

# cohere call
response = completion(model="command-nightly", messages=messages)

# anthropic call
response = completion(model="claude-instant-1", messages=messages)

ishaan-jaff commented 8 months ago

@loubnabnl @Muennighoff @infinitylogesh can you take a look at this PR when possible?

I believe LiteLLM makes it easier to benchmark LLMs. I would love your feedback if you disagree.
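
As a rough illustration of the benchmarking claim, here is a minimal sketch of a loop that collects several generations per model through one completion-style function. It is not the code in this PR. The names `generate_candidates` and `fake_completion` are hypothetical, and the sketch assumes the response follows the OpenAI chat format (`choices[0].message.content`). The stub keeps the example runnable offline, without API keys.

```python
from typing import Callable, Dict, List

# Hypothetical alias: any function called as fn(model=..., messages=...)
# that returns an OpenAI-style chat response (assumption for this sketch).
CompletionFn = Callable[..., Dict]


def generate_candidates(
    completion_fn: CompletionFn,
    models: List[str],
    prompt: str,
    n_samples: int = 2,
) -> Dict[str, List[str]]:
    """Collect n_samples generations per model for a single prompt."""
    messages = [{"content": prompt, "role": "user"}]
    generations: Dict[str, List[str]] = {}
    for model in models:
        samples = []
        for _ in range(n_samples):
            # The call is identical for every provider; only `model` changes.
            response = completion_fn(model=model, messages=messages)
            samples.append(response["choices"][0]["message"]["content"])
        generations[model] = samples
    return generations


def fake_completion(model: str, messages: List[Dict[str, str]]) -> Dict:
    """Offline stand-in for a real completion call (illustration only)."""
    text = f"# {model}: " + messages[0]["content"]
    return {"choices": [{"message": {"role": "assistant", "content": text}}]}


results = generate_candidates(
    fake_completion,
    models=["gpt-3.5-turbo", "command-nightly", "claude-instant-1"],
    prompt="def add(a, b):",
)
```

To run against real providers, `fake_completion` could be swapped for `litellm.completion` once the relevant API keys are set in the environment.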

loubnabnl commented 7 months ago

Thanks for the PR. What do you think, @Muennighoff?

ishaan-jaff commented 7 months ago

@Muennighoff @loubnabnl updated based on feedback - thanks