BerriAI / litellm

Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)
https://docs.litellm.ai/docs/

[Feature]: Add llama3 from deepinfra #4255

Closed (agamm closed this issue 1 week ago)

agamm commented 1 week ago

The Feature

It's already working with their API: https://deepinfra.com/meta-llama/Meta-Llama-3-70B-Instruct

Motivation, pitch

There is only llama2, but not the vastly superior llama3.

Twitter / LinkedIn details

unzip_dev

ishaan-jaff commented 1 week ago

we already support this @agamm - added to docs here too: https://github.com/BerriAI/litellm/pull/4265

ishaan-jaff commented 1 week ago

Do this

litellm.completion(model="deepinfra/meta-llama/Meta-Llama-3-8B-Instruct", messages=messages)
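For completeness, a self-contained sketch of that call (assuming litellm is installed and a DEEPINFRA_API_KEY environment variable is set; the message content here is just illustrative):

```python
import os

# The "deepinfra/" prefix routes the request to DeepInfra's
# OpenAI-compatible API.
model = "deepinfra/meta-llama/Meta-Llama-3-8B-Instruct"
messages = [{"role": "user", "content": "Say hello in one word."}]

# Only attempt the network call when an API key is actually configured.
if os.environ.get("DEEPINFRA_API_KEY"):
    import litellm

    response = litellm.completion(model=model, messages=messages)
    print(response.choices[0].message.content)
```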
ishaan-jaff commented 1 week ago

@agamm any chance we can hop on a call? I'd love to learn how we can improve litellm for you.

I reached out to you on LinkedIn, in case DMs work for you.

Sharing a link to my cal for your convenience: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

agamm commented 1 week ago

My bad, I used meta-llama/Meta-Llama-3-8B-Instruct without adding the deepinfra/ prefix at the start.

agamm commented 4 days ago

Hey guys, I'm still getting:

litellm.exceptions.BadRequestError: litellm.BadRequestError: DeepinfraException - Deepinfra doesn't support tool_choice={'type': 'function', 'function': {'name': 'QueryResult'}}. To drop unsupported deepinfra params from the call, set litellm.drop_params = True

for this code:

import instructor
import litellm
from litellm import completion

litellm.set_verbose = True
client = instructor.from_litellm(completion)


# (wrapped in a function here for context; SYSTEM_STRUCUTRE_ANSWER and
# QueryResult are defined elsewhere in my codebase)
def structure_answer(answer_raw):
    messages = [
        {
            "role": "system",
            "content": "You take text as input and structure it as valid JSON output.",
        },
        {
            "role": "user",
            "content": SYSTEM_STRUCUTRE_ANSWER.format(answer=answer_raw),
        },
    ]

    return client.chat.completions.create(
        model="deepinfra/meta-llama/Meta-Llama-3-8B-Instruct",
        messages=messages,
        max_tokens=512,
        response_model=QueryResult,
        temperature=0.0,
    )

Dependency versions:

litellm = "^1.40.15"
instructor = "^1.3.3"

Lock file:

[[package]]
name = "litellm"
version = "1.40.20"
...
[[package]]
name = "instructor"
version = "1.3.3"
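The error message itself points at a possible workaround: setting litellm.drop_params = True makes litellm strip parameters a provider rejects (here, the tool_choice that DeepInfra doesn't accept) instead of raising BadRequestError. A minimal sketch, assuming a DEEPINFRA_API_KEY environment variable; note that dropping tool_choice may also disable the function-calling path instructor relies on, so structured output might then need a different instructor mode:

```python
import os

MODEL = "deepinfra/meta-llama/Meta-Llama-3-8B-Instruct"

# Only attempt the network call when an API key is actually configured.
if os.environ.get("DEEPINFRA_API_KEY"):
    import litellm

    # Global switch: silently drop params the target provider
    # doesn't support (e.g. tool_choice for DeepInfra) rather
    # than raising litellm.BadRequestError.
    litellm.drop_params = True

    response = litellm.completion(
        model=MODEL,
        messages=[{"role": "user", "content": "Reply with a valid JSON object."}],
    )
    print(response.choices[0].message.content)
```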