[Feature]: Support vertex ai 'response_schema' param

BerriAI / litellm

Python SDK, Proxy Server to call 100+ LLM APIs using the OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]

https://docs.litellm.ai/docs/

[Feature]: Support vertex ai 'response_schema' param #4473

Closed krrishdholakia closed 2 months ago

krrishdholakia commented 2 months ago

The Feature

Similar to the gemini specific implementation - https://docs.litellm.ai/docs/providers/vertex#json-schema

"for models that support it directly forward it along. For models that don't we would append it to the prompt.

and then validate the schema on return and if it fails treat it as any other failure with retries

we could do the validation, but having it in LiteLLM so fallbacks work is better." - @lolsborn
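A minimal sketch of the forward-or-append idea in the quote above. The helper name `apply_response_schema` and the `supports_schema` flag are hypothetical, chosen for illustration; this is not litellm's internal API, and the exact prompt wording is an assumption.

```python
import json
from typing import Any


def apply_response_schema(
    messages: list[dict[str, Any]],
    response_schema: dict[str, Any],
    supports_schema: bool,
) -> dict[str, Any]:
    """Build completion kwargs (hypothetical helper, not a litellm function)."""
    if supports_schema:
        # The model handles the schema natively: forward it untouched.
        return {
            "messages": messages,
            "response_format": {
                "type": "json_object",
                "response_schema": response_schema,
            },
        }
    # Otherwise append the schema to the prompt as an extra user message.
    schema_message = {
        "role": "user",
        "content": "Respond only with JSON matching this schema:\n"
        + json.dumps(response_schema),
    }
    # Build a new list so the caller's messages are not mutated.
    return {"messages": [*messages, schema_message]}
```

The returned dict could then be unpacked into a completion call. How a real implementation decides whether a model supports the schema is not specified in this thread.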

v0 coverage:

gemini
claude 3.5 sonnet
azure gpt-4o

Motivation, pitch

user request, to enable fallbacks/retries to work for json mode calls

krrishdholakia commented 2 months ago

Example of how the gemini response schema works:

from litellm import completion 
import json 

## SETUP ENVIRONMENT
# !gcloud auth application-default login - run this to add vertex credentials to your env

messages = [
    {
        "role": "user",
        "content": "List 5 popular cookie recipes."
    }
]

response_schema = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "recipe_name": {
                "type": "string",
            },
        },
        "required": ["recipe_name"],
    },
}

# Keep the return value; `completion` itself is the function, not the response.
response = completion(
    model="vertex_ai_beta/gemini-1.5-pro",
    messages=messages,
    response_format={"type": "json_object", "response_schema": response_schema},  # 👈 KEY CHANGE
)

print(json.loads(response.choices[0].message.content))

krrishdholakia commented 2 months ago

v0:

if the model does not support response_schema, add the schema as a user message
validate the output so fallbacks/retries on the router can trigger
support claude 3.5 sonnet on vertex ai

Bonus:

azure gpt-4o support
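A rough sketch of why validating the output matters for fallbacks/retries: if a schema failure raises like any other error, the same retry loop covers it. `complete_with_fallbacks`, `call_model` and `validate` are hypothetical names for illustration only; this is not how litellm's Router is implemented.

```python
from typing import Callable, Optional, Sequence


def complete_with_fallbacks(
    models: Sequence[str],
    call_model: Callable[[str], str],
    validate: Callable[[str], None],
    num_retries: int = 1,
) -> str:
    """Try each model in order, retrying each one before falling back."""
    last_error: Optional[Exception] = None
    for model in models:
        for _ in range(num_retries + 1):
            try:
                raw = call_model(model)
                # A schema failure raises here and is handled exactly
                # like a network or provider error.
                validate(raw)
                return raw
            except Exception as err:
                last_error = err
    raise RuntimeError("all models failed") from last_error
```

Here `validate` is any callable that raises on a bad response, so the loop does not need to know about JSON schemas at all.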

lolsborn commented 2 months ago

@andrazk is likely to be the person to follow up on this and do the testing on our end. He was already implementing some of this validation.

krrishdholakia commented 2 months ago

Rough thoughts re: implementation:

If response_schema is set, litellm will validate the response against the schema, and raise a JSONSchemaValidationError if the response does not match the schema.

JSONSchemaValidationError inherits from openai.APIError

Access the raw response with e.raw_response

from litellm import completion, JSONSchemaValidationError

# `messages` and `response_schema` are the ones defined in the earlier example.
try:
    completion(
        model="vertex_ai_beta/gemini-1.5-pro",
        messages=messages,
        response_format={"type": "json_object", "response_schema": response_schema},  # 👈 KEY CHANGE
    )
except JSONSchemaValidationError as e:
    print("Raw Response: {}".format(e.raw_response))
    raise  # re-raise with the original traceback
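For illustration, the validation step could look roughly like the sketch below. It uses only the standard library and checks a small subset of JSON Schema (`type`, `required`, `properties`, `items`); a real implementation would more likely rely on a full validator. `SchemaValidationError` and `validate_response` are hypothetical stand-ins, not litellm's actual classes or functions.

```python
import json
from typing import Any


class SchemaValidationError(Exception):
    """Illustrative stand-in for the error described above."""

    def __init__(self, message: str, raw_response: str) -> None:
        super().__init__(message)
        self.raw_response = raw_response  # keep the raw text for debugging


_TYPES = {"object": dict, "array": list, "string": str}


def _check(value: Any, schema: dict[str, Any], path: str) -> None:
    # Only the type names listed in _TYPES are checked in this sketch.
    expected = _TYPES.get(schema.get("type", ""))
    if expected is not None and not isinstance(value, expected):
        raise ValueError(f"{path}: expected {schema['type']}")
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                raise ValueError(f"{path}: missing required key {key!r}")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in value:
                _check(value[key], sub_schema, f"{path}.{key}")
    if isinstance(value, list) and "items" in schema:
        for index, item in enumerate(value):
            _check(item, schema["items"], f"{path}[{index}]")


def validate_response(raw_response: str, schema: dict[str, Any]) -> Any:
    """Parse the raw response and check it against the schema subset."""
    try:
        parsed = json.loads(raw_response)
        _check(parsed, schema, "$")
    except ValueError as err:  # json.JSONDecodeError subclasses ValueError
        raise SchemaValidationError(str(err), raw_response) from err
    return parsed
```

Carrying the raw response on the exception mirrors the `e.raw_response` access described above, so a caller can log what the model actually returned.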

krrishdholakia commented 2 months ago

anthropic seems really poor at executing this

krrishdholakia commented 2 months ago

scoped to just vertex ai - this is now live https://docs.litellm.ai/docs/providers/vertex#json-schema