
[Bug]: Vertex gemini-pro-vision error using LiteLLM SDK #5768

Closed: rambihari closed this issue 2 months ago

rambihari commented 2 months ago

What happened?

A bug happened! I'm developing an endpoint using the LiteLLM SDK, and it works fine with every model I've tried from Google's Vertex Model Garden except gemini-pro-vision. The response JSON I get back is just this, which isn't descriptive at all:

{
    "detail": {
        "Error": "'image'"
    }
}

Looking further in my application logs, LiteLLM logs a more detailed error message, which I'm assuming is what causes the error response above to be returned.

Exception occured - model info for model=gemini-1.0-pro-vision does not have 'output_cost_per_character'-pricing
model_info={'key': 'gemini-1.0-pro-vision', 'max_tokens': 2048, 'max_input_tokens': 16384, 'max_output_tokens': 2048, 'input_cost_per_token': 2.5e-07, 'cache_creation_input_token_cost': None, 'cache_read_input_token_cost': None, 'input_cost_per_character': None, 'input_cost_per_token_above_128k_tokens': None, 'output_cost_per_token': 5e-07, 'output_cost_per_character': None, 'output_cost_per_token_above_128k_tokens': None, 'output_cost_per_character_above_128k_tokens': None, 'output_vector_size': None, 'litellm_provider': 'vertex_ai-vision-models', 'mode': 'chat', 'supported_openai_params': ['temperature', 'top_p', 'max_tokens', 'stream', 'tools', 'tool_choice', 'response_format', 'n', 'stop', 'extra_headers'], 'supports_system_messages': None, 'supports_response_schema': None, 'supports_vision': True, 'supports_function_calling': True, 'supports_assistant_prefill': False}
Traceback (most recent call last):
  File "C:\Users\rraman\AppData\Local\Programs\Python\Python311\Lib\site-packages\litellm\litellm_core_utils\llm_cost_calc\google.py", line 154, in cost_per_character
    assert (
AssertionError: model info for model=gemini-1.0-pro-vision does not have 'output_cost_per_character'-pricing

It seems that for gemini-pro-vision, model_info doesn't include output_cost_per_character, which is the unit used to calculate cost for this model, and that's what causes the error response to be returned.
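
For what it's worth, the missing pricing field can be inspected directly (a rough sketch; I'm assuming litellm.get_model_info resolves the same registry key shown in the log below):

    import litellm

    # Look up the registry entry the cost calculation reads (key assumed from the log output)
    info = litellm.get_model_info("gemini-1.0-pro-vision")
    print(info.get("output_cost_per_character"))  # None, which trips the assert in cost_per_character()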

Relevant log output

litellm.litellm_core_utils.llm_cost_calc.google.cost_per_character():                 Defaulting to (cost_per_token * 4) calculation for completion_cost
Exception occured - model info for model=gemini-1.0-pro-vision does not have 'output_cost_per_character'-pricing
model_info={'key': 'gemini-1.0-pro-vision', 'max_tokens': 2048, 'max_input_tokens': 16384, 'max_output_tokens': 2048, 'input_cost_per_token': 2.5e-07, 'cache_creation_input_token_cost': None, 'cache_read_input_token_cost': None, 'input_cost_per_character': None, 'input_cost_per_token_above_128k_tokens': None, 'output_cost_per_token': 5e-07, 'output_cost_per_character': None, 'output_cost_per_token_above_128k_tokens': None, 'output_cost_per_character_above_128k_tokens': None, 'output_vector_size': None, 'litellm_provider': 'vertex_ai-vision-models', 'mode': 'chat', 'supported_openai_params': ['temperature', 'top_p', 'max_tokens', 'stream', 'tools', 'tool_choice', 'response_format', 'n', 'stop', 'extra_headers'], 'supports_system_messages': None, 'supports_response_schema': None, 'supports_vision': True, 'supports_function_calling': True, 'supports_assistant_prefill': False}
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/litellm/litellm_core_utils/llm_cost_calc/google.py", line 155, in cost_per_character
    "output_cost_per_character" in model_info
AssertionError: model info for model=gemini-1.0-pro-vision does not have 'output_cost_per_character'-pricing
model_info={'key': 'gemini-1.0-pro-vision', 'max_tokens': 2048, 'max_input_tokens': 16384, 'max_output_tokens': 2048, 'input_cost_per_token': 2.5e-07, 'cache_creation_input_token_cost': None, 'cache_read_input_token_cost': None, 'input_cost_per_character': None, 'input_cost_per_token_above_128k_tokens': None, 'output_cost_per_token': 5e-07, 'output_cost_per_character': None, 'output_cost_per_token_above_128k_tokens': None, 'output_cost_per_character_above_128k_tokens': None, 'output_vector_size': None, 'litellm_provider': 'vertex_ai-vision-models', 'mode': 'chat', 'supported_openai_params': ['temperature', 'top_p', 'max_tokens', 'stream', 'tools', 'tool_choice', 'response_format', 'n', 'stop', 'extra_headers'], 'supports_system_messages': None, 'supports_response_schema': None, 'supports_vision': True, 'supports_function_calling': True, 'supports_assistant_prefill': False}


krrishdholakia commented 2 months ago

Hi @rambihari, can you share a minimal script to repro the issue? Cost tracking is done in a separate thread and shouldn't be causing the call to fail.

rambihari commented 2 months ago

Hey Krish, sure, I've attached the script; hopefully you're able to reproduce what I'm seeing on my end.

Thanks, Ram


krrishdholakia commented 2 months ago

Hi @rambihari, I don't see a script.

rambihari commented 2 months ago

Attached here directly.

litellm_completions.zip

krrishdholakia commented 2 months ago

Thanks @rambihari

krrishdholakia commented 2 months ago
    # Current support for Vertex Anthropic models is limited to certain regions
    if "claude-3-opus" in inputs["model"]:
        litellm.vertex_location = "us-east5"
    elif "claude-3" in inputs["model"]:
        litellm.vertex_location = "us-central1"
    else:
        litellm.vertex_location = GCP_REGION

@rambihari you don't need to do this. You can just pass vertex_location in as a completion arg:

completion(..., vertex_location=GCP_REGION)
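
For example (a sketch; GCP_REGION and the message payload are placeholders for whatever your script already builds):

    import litellm

    GCP_REGION = "us-central1"  # placeholder region

    response = litellm.completion(
        model="vertex_ai/gemini-pro-vision",
        messages=[{"role": "user", "content": "describe this image"}],
        vertex_location=GCP_REGION,  # per-call override; no need to set litellm.vertex_location globally
    )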
krrishdholakia commented 2 months ago

Unable to repro @rambihari

Got:

ModelResponse(id='chatcmpl-38c3d539-cc17-42b7-8c10-9a26273f1d78', choices=[Choices(finish_reason='stop', index=0, message=Message(content=' The image shows a pontoon boat on the Charles River in Boston, Massachusetts. The boat is docked near the Longfellow Bridge, and the Zakim Bridge is in the distance. There are a few other boats on the river, and the skyline of Boston is in the background. The water is calm and there are a few clouds in the sky.', role='assistant', tool_calls=None, function_call=None))], created=1726945261, model='gemini-pro-vision', object='chat.completion', system_fingerprint=None, usage=Usage(completion_tokens=69, prompt_tokens=261, total_tokens=330, completion_tokens_details=None), vertex_ai_grounding_metadata=[], vertex_ai_safety_results=[[{'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability': 'NEGLIGIBLE', 'probabilityScore': 0.010681152, 'severity': 'HARM_SEVERITY_NEGLIGIBLE', 'severityScore': 0.06640625}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability': 'NEGLIGIBLE', 'probabilityScore': 0.059326172, 'severity': 'HARM_SEVERITY_NEGLIGIBLE', 'severityScore': 0.08886719}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability': 'NEGLIGIBLE', 'probabilityScore': 0.040771484, 'severity': 'HARM_SEVERITY_NEGLIGIBLE', 'severityScore': 0.03466797}, {'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability': 'NEGLIGIBLE', 'probabilityScore': 0.045410156, 'severity': 'HARM_SEVERITY_NEGLIGIBLE', 'severityScore': 0.056640625}]], vertex_ai_citation_metadata=[])

Tested with:

    response = get_litellm_completion(
        inputs={
            "model": "gemini-pro-vision",
            "prompt": "describe this image",
            "images": ["gs://cloud-samples-data/generative-ai/image/boats.jpeg"],
            "temperature": 0.8,
            "top_p": 1,
            "n": 1,
            "max_tokens": 256,
            "presence_penalty": None,
            "frequency_penalty": None,
            "stream": False,
            "userID": "my-test-user",
            "source": "my-test-source",
        }
    )

on the latest version of litellm - v1.47.0.
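
For reference, the equivalent direct SDK call looks roughly like this (a sketch; get_litellm_completion is the wrapper from the attached script, so the exact mapping of its inputs is assumed):

    import litellm

    response = litellm.completion(
        model="vertex_ai/gemini-pro-vision",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "describe this image"},
                    {
                        "type": "image_url",
                        "image_url": {"url": "gs://cloud-samples-data/generative-ai/image/boats.jpeg"},
                    },
                ],
            }
        ],
        temperature=0.8,
        top_p=1,
        n=1,
        max_tokens=256,
        stream=False,
    )
    print(response)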

Please feel free to reopen or create a new issue if the problem persists, with clear steps to reproduce the error and a complete stack trace.