BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

[Bug]: error converting pydantic base model to json schema #6848

Closed alexkuzmik closed 12 hours ago

alexkuzmik commented 21 hours ago

What happened?

  1. If you call litellm.get_supported_openai_params("ollama/qwen2.5:3b"), you'll see that the response_format parameter is supported. The docs give an example of specifying it via a pydantic.BaseModel class (which is intuitive, since OpenAI does the same). However, if you then call a completion function, you'll get the error below (a minimal illustration of its likely trigger follows this list):

    TypeError: <class 'metric.HallucinationResponseFormat'> cannot be parametrized because it does not inherit from typing.Generic
  2. If you call litellm.get_supported_openai_params("groq/llama-3.1-70b-versatile"), you'll also see that response_format is supported; however, using this parameter leads to:

    litellm.exceptions.APIError: litellm.APIError: APIError: GroqException - You tried to pass a `BaseModel` class to `chat.completions.create()`; You must use `beta.chat.completions.parse()` instead

    This is already mentioned in another issue, btw.

It would be nice to have more robust handling of structured outputs.
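
For what it's worth, the TypeError in item 1 matches what pydantic v2 raises when a non-generic model class gets subscripted, which would point at litellm's conversion path rather than the model itself. A minimal illustration of that trigger (an assumption about the cause, not a trace through litellm internals):

import pydantic

class ResponseFormat(pydantic.BaseModel):
    x: str
    y: str

# Subscripting a model class that does not inherit from typing.Generic
# raises: TypeError: <class '...ResponseFormat'> cannot be parametrized
# because it does not inherit from typing.Generic
ResponseFormat["anything"]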

Relevant log output

No response


krrishdholakia commented 19 hours ago

@alexkuzmik the two issues are separate; the Groq issue is already addressed.

Your current issue seems to indicate a non-standard pydantic model. This is the specific code raising the error: https://github.com/BerriAI/litellm/blob/ddfe687b13e9f31db2fb2322887804e3d01dd467/litellm/utils.py#L5298
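
Roughly, that code has to turn a BaseModel class into a provider-ready JSON schema. A minimal sketch of the expected behavior, assuming pydantic v2 (an illustrative function, not the actual litellm implementation):

import pydantic

def type_to_response_format(model_cls: type) -> dict:
    # A standard pydantic v2 model is a class that inherits from BaseModel
    # and exposes model_json_schema(); anything else should be rejected
    # with a clear error rather than an opaque TypeError.
    if not (isinstance(model_cls, type) and issubclass(model_cls, pydantic.BaseModel)):
        raise ValueError(f"{model_cls!r} is not a pydantic BaseModel subclass")
    return {
        "type": "json_schema",
        "json_schema": {
            "name": model_cls.__name__,
            "schema": model_cls.model_json_schema(),
        },
    }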

Can you share a minimal code snippet to repro this issue?

alexkuzmik commented 19 hours ago

@krrishdholakia Sure

import litellm
import pydantic


class ResponseFormat(pydantic.BaseModel):
    x: str
    y: str


# Raises: TypeError: <class '...ResponseFormat'> cannot be parametrized
# because it does not inherit from typing.Generic
litellm.completion(
    model='ollama/qwen2.5:3b',
    messages=[
        {
            "content": 'Return some x and y values in {"x": ..., "y": ...} json format',
            "role": "user",
        },
    ],
    response_format=ResponseFormat,
)
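
In the meantime, a possible workaround is to build the response_format dict yourself so litellm never has to convert the class. A sketch, assuming pydantic v2 and a provider that accepts an OpenAI-style json_schema dict (support varies by provider):

import litellm
import pydantic

class ResponseFormat(pydantic.BaseModel):
    x: str
    y: str

# Passing a plain dict sidesteps the BaseModel-to-schema conversion.
litellm.completion(
    model='ollama/qwen2.5:3b',
    messages=[
        {
            "content": 'Return some x and y values in {"x": ..., "y": ...} json format',
            "role": "user",
        },
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": ResponseFormat.__name__,
            "schema": ResponseFormat.model_json_schema(),
        },
    },
)
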
krrishdholakia commented 18 hours ago

thanks - able to repro