BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

GEMINI-1.5-PRO Main Day-1 support🧵 #2881

Closed jpshack-at-palomar closed 4 months ago

jpshack-at-palomar commented 7 months ago

What happened?

This is a placeholder for others who have this issue. There is likely no bug that needs to be fixed in LiteLLM, but we won't know until more people have access to the gemini-1.5-pro API. There is some evidence that the model will actually be provided as gemini-1.5-pro-latest. Note that this issue DOES NOT relate to the Vertex API, which is a different API and LiteLLM provider.

Source code:

from litellm import completion
import os

os.environ['GEMINI_API_KEY'] = "<<REDACTED>>"
response = completion(
    model="gemini/gemini-1.5-pro", 
    messages=[{"role": "user", "content": "write code for saying hi from LiteLLM"}]
)

See https://docs.litellm.ai/docs/providers/gemini#pre-requisites

Versions

google-generativeai                      0.4.1
litellm                                  1.34.28

Exception:

Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm.set_verbose=True'.

Traceback (most recent call last):
  File ".env/lib/python3.12/site-packages/litellm/llms/gemini.py", line 216, in completion
    response = _model.generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File ".env/lib/python3.12/site-packages/google/generativeai/generative_models.py", line 232, in generate_content
    response = self._client.generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".env/lib/python3.12/site-packages/google/ai/generativelanguage_v1beta/services/generative_service/client.py", line 566, in generate_content
    response = rpc(
               ^^^^
  File ".env/lib/python3.12/site-packages/google/api_core/gapic_v1/method.py", line 131, in __call__
    return wrapped_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".env/lib/python3.12/site-packages/google/api_core/retry/retry_unary.py", line 293, in retry_wrapped_func
    return retry_target(
           ^^^^^^^^^^^^^
  File ".env/lib/python3.12/site-packages/google/api_core/retry/retry_unary.py", line 153, in retry_target
    _retry_error_helper(
  File ".env/lib/python3.12/site-packages/google/api_core/retry/retry_base.py", line 212, in _retry_error_helper
    raise final_exc from source_exc
  File ".env/lib/python3.12/site-packages/google/api_core/retry/retry_unary.py", line 144, in retry_target
    result = target()
             ^^^^^^^^
  File "/Volumes/boxy-01/code-archive/viewing/OpenDevin/.env/lib/python3.12/site-packages/google/api_core/timeout.py", line 120, in func_with_timeout
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File ".env/lib/python3.12/site-packages/google/api_core/grpc_helpers.py", line 78, in error_remapped_callable
    raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.NotFound: 404 models/gemini-1.5-pro is not found for API version v1beta, or is not supported for GenerateContent. Call ListModels to see the list of available models and their supported methods.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File ".env/lib/python3.12/site-packages/litellm/main.py", line 1637, in completion
    model_response = gemini.completion(
                     ^^^^^^^^^^^^^^^^^^
  File ".env/lib/python3.12/site-packages/litellm/llms/gemini.py", line 222, in completion
    raise GeminiError(
litellm.llms.gemini.GeminiError: 404 models/gemini-1.5-pro is not found for API version v1beta, or is not supported for GenerateContent. Call ListModels to see the list of available models and their supported methods.

See the upstream issue here: https://github.com/google/generative-ai-python/issues/227

Relevant log output

No response

Twitter / LinkedIn details

No response

jpshack-at-palomar commented 7 months ago

See also: https://ai.google.dev/models/gemini

jpshack-at-palomar commented 7 months ago

To determine whether you are seeing a limitation related to your key, use this test:

If you have a key that is good for Gemini Pro 1.0 but not 1.5, you will get a response for this request:

gemini-1.0-pro-latest

curl \
  -H 'Content-Type: application/json' \
  -d '{"contents":[{"parts":[{"text":"Write a story about a magic backpack"}]}]}' \
  -X POST 'https://generativelanguage.googleapis.com/v1beta/models/gemini-1.0-pro-latest:generateContent?key=YOUR_API_KEY'

Response:

{
    "candidates": [
        {
            "content": {
                "parts": [
                    {
                        "text": "In the quaint town of Willow Creek, amidst the rolling hills and whispering willow trees, there lived an ordinary boy named Ethan. Unbeknownst to him, an extraordinary adventure awaited him, hidden within the dusty attic of his grandmother's ancient home.\n\nAs Ethan rummaged through forgotten artifacts, his hands stumbled upon a worn-out leather backpack. Curiosity sparked within him as he unzipped the faded exterior, revealing a chaotic jumble of old schoolbooks, maps, and trinkets.\n\nUnbeknownst to Ethan, this was no ordinary backpack. It whispered ancient secrets and held a dormant power that was about to be awakened. As he delved deeper into the backpack's contents, his fingers brushed against a smooth, glowing orb.\n\nSuddenly, the backpack trembled, and a faint blue aura enveloped it. Ethan gasped in astonishment as the orb emitted a surge of energy that coursed through his body. In that instant, he felt an overwhelming connection to the world around him.\n\nThe backpack's newfound magic granted Ethan the ability to understand and interact with nature in an extraordinary way. He could hear the whispers of the wind, read the patterns of the stars, and communicate with animals.\n\nEmbarking on a thrilling odyssey, Ethan used the magic backpack to explore the hidden wonders of Willow Creek. He befriended the wise old owl that perched atop the town clock, learned the secret language of the squirrels, and discovered the enchanted waterfall tucked away in the deepest part of the forest.\n\nHowever, with great power came great responsibility. Ethan soon realized that the backpack's magic could be both a blessing and a curse. When the backpack's power flared out of control, it attracted the attention of malicious forces who sought to harness its energy for their own evil schemes.\n\nFacing danger at every turn, Ethan learned the true meaning of courage and perseverance. Guided by the wisdom of the backpack and his newfound friends, he outsmarted cunning sorcerers, outmaneuvered sly thieves, and ultimately defeated the forces of darkness.\n\nIn the end, Ethan emerged from his adventure as a wiser and more compassionate boy. The magic backpack, once a forgotten relic, became a lifelong companion and a symbol of the boundless possibilities that lie within us. And so, in the annals of Willow Creek, the tale of the boy with the magic backpack was passed down for generations to come, inspiring dreams and igniting the flames of imagination in every child who dared to believe in the extraordinary."
                    }
                ],
                "role": "model"
            },
            "finishReason": "STOP",
            "index": 0,
            "safetyRatings": [
                {
                    "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
                    "probability": "NEGLIGIBLE"
                },
                {
                    "category": "HARM_CATEGORY_HATE_SPEECH",
                    "probability": "NEGLIGIBLE"
                },
                {
                    "category": "HARM_CATEGORY_HARASSMENT",
                    "probability": "NEGLIGIBLE"
                },
                {
                    "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
                    "probability": "NEGLIGIBLE"
                }
            ]
        }
    ],
    "promptFeedback": {
        "safetyRatings": [
            {
                "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
                "probability": "NEGLIGIBLE"
            },
            {
                "category": "HARM_CATEGORY_HATE_SPEECH",
                "probability": "NEGLIGIBLE"
            },
            {
                "category": "HARM_CATEGORY_HARASSMENT",
                "probability": "NEGLIGIBLE"
            },
            {
                "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
                "probability": "NEGLIGIBLE"
            }
        ]
    }
}

gemini-1.5-pro-latest

curl \
  -H 'Content-Type: application/json' \
  -d '{"contents":[{"parts":[{"text":"Write a story about a magic backpack"}]}]}' \
  -X POST 'https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-pro-latest:generateContent?key=YOUR_API_KEY'

Response:

{
    "error": {
        "code": 404,
        "message": "models/gemini-1.5-pro-latest is not found for API version v1beta, or is not supported for GenerateContent. Call ListModels to see the list of available models and their supported methods.",
        "status": "NOT_FOUND"
    }
}

krrishdholakia commented 7 months ago

cc: @Manouchehri you might have some context on this - https://github.com/BerriAI/litellm/pull/2841

krrishdholakia commented 7 months ago

@jpshack-at-palomar Will also update our exception here:

litellm.llms.gemini.GeminiError: 404 models/gemini-1.5-pro is not found for API version v1beta, or is not supported for GenerateContent. Call ListModels to see the list of available models and their supported methods.

to point to this ticket.

Thanks for this!

Manouchehri commented 7 months ago

I think you're right too.

That said, Gemini's naming does seem to barely follow any convention at times, so I wouldn't be shocked if they change the naming when it goes into public preview. 😅

jameshiggie commented 7 months ago

Got it working if you force the update of google-generativeai to 0.4.1, since it's pinned to 0.3.2 in litellm's requirements. Code:

from litellm import completion
import os

os.environ['GEMINI_API_KEY'] = "<<REDACTED>>"
response = completion(
    model="gemini/gemini-1.5-pro-latest", 
    messages=[{"role": "user", "content": "write code for saying hi from LiteLLM"}]
)
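
For reference, forcing the upgrade with pip looks like this (version per the above; run after installing litellm and adjust to your environment):

pip install --upgrade google-generativeai==0.4.1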

LLM response, formatted:

Saying Hi from LiteLLM: Code Options

Here are a few ways to write code for saying hi from LiteLLM, depending on your desired output and programming language:

Python:

print("Hi from LiteLLM!")

JavaScript:

console.log("Hi from LiteLLM!");

C++:

#include <iostream>

int main() {
  std::cout << "Hi from LiteLLM!" << std::endl;
  return 0;
}

Java:

public class Main {
  public static void main(String[] args) {
    System.out.println("Hi from LiteLLM!");
  }
}

Using a Function:

def say_hi():
  print("Hi from LiteLLM!")

say_hi()

This code defines a function called say_hi that prints the message and then calls the function to execute it.

Adding User Input:

name = input("What is your name? ")
print(f"Hi {name}, from LiteLLM!")

This code asks the user for their name and then includes it in the greeting.

Choosing the Right Code:

Remember to choose the code that best suits your specific needs and programming language.

jpshack-at-palomar commented 7 months ago

I just received access to gemini/gemini-1.5-pro-latest this evening and can confirm @jameshiggie's result with the following versions:

google-generativeai          0.5.0
litellm                      1.34.38

I will open a PR for a correction to https://docs.litellm.ai/docs/providers/gemini to show the model name as gemini/gemini-1.5-pro-latest.

krrishdholakia commented 7 months ago

Testing on my end as well - I believe the new genai module also supports system instructions.

krrishdholakia commented 7 months ago

Pinning thread for anyone else filing issues on the new gemini updates. Will be easier to consolidate discussion here.

jameshiggie commented 7 months ago

Is there a plan to get async streaming working for "Google AI Studio - Gemini" as well? In testing it's great, but for our use case we need streaming :)

krrishdholakia commented 7 months ago

Hey @jameshiggie this should already be working. Do you see an error?

[Screenshot: 2024-04-09 at 5:53 PM]

jameshiggie commented 7 months ago

Oh nice! Let me test now :) - I didn't bother trying since I saw in the readme it wasn't supported :S

jameshiggie commented 7 months ago

Running the example case:

from litellm import acompletion
import asyncio, os, traceback

async def completion_call():
    try:
        print("test acompletion + streaming")
        messages=[{"role": "user","content": "write code for saying hi from LiteLLM"}]
        response = await acompletion(
            model="gemini/gemini-1.5-pro-latest", 
            messages=messages, 
            stream=True
        )
        print(f"response: {response}")
        async for chunk in response:
            print(chunk)
    except:
        print(f"error occurred: {traceback.format_exc()}")
        pass

await completion_call()

fails :(

output:

test acompletion + streaming

Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm.set_verbose=True'.

error occurred: Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/litellm/llms/gemini.py", line 175, in async_streaming
    response = await _model.generate_content_async(
  File "/usr/local/lib/python3.10/dist-packages/google/generativeai/generative_models.py", line 263, in generate_content_async
    iterator = await self._async_client.stream_generate_content(
  File "/usr/local/lib/python3.10/dist-packages/google/api_core/retry_async.py", line 223, in retry_wrapped_func
    return await retry_target(
  File "/usr/local/lib/python3.10/dist-packages/google/api_core/retry_async.py", line 121, in retry_target
    return await asyncio.wait_for(
  File "/usr/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
    return fut.result()
  File "/usr/local/lib/python3.10/dist-packages/google/api_core/grpc_helpers_async.py", line 177, in error_remapped_callable
    raise TypeError("Unexpected type of call %s" % type(call))
TypeError: Unexpected type of call <class 'google.api_core.rest_streaming.ResponseIterator'>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/litellm/main.py", line 317, in acompletion
    response = await init_response
  File "/usr/local/lib/python3.10/dist-packages/litellm/llms/gemini.py", line 192, in async_streaming
    raise GeminiError(status_code=500, message=str(e))
litellm.llms.gemini.GeminiError: Unexpected type of call <class 'google.api_core.rest_streaming.ResponseIterator'>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<ipython-input-5-f5bdc6ae1aa1>", line 8, in completion_call
    response = await acompletion(
  File "/usr/local/lib/python3.10/dist-packages/litellm/utils.py", line 3418, in wrapper_async
    raise e
  File "/usr/local/lib/python3.10/dist-packages/litellm/utils.py", line 3250, in wrapper_async
    result = await original_function(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/litellm/main.py", line 330, in acompletion
    raise exception_type(
  File "/usr/local/lib/python3.10/dist-packages/litellm/utils.py", line 8533, in exception_type
    raise e
  File "/usr/local/lib/python3.10/dist-packages/litellm/utils.py", line 8501, in exception_type
    raise APIConnectionError(
litellm.exceptions.APIConnectionError: Unexpected type of call <class 'google.api_core.rest_streaming.ResponseIterator'>

krrishdholakia commented 7 months ago

ok - i'll work on repro'ing + push a fix @jameshiggie

jameshiggie commented 7 months ago

thanks! 🔥

jameshiggie commented 7 months ago

After leaving the example case, I tried it out in a dev version of our product and it works! :D Using the same versions for both, though... :S

google-generativeai          0.4.1
litellm                      1.34.38

A little disappointed that the streaming from Google is very chunky - I'm guessing due to safety screening. But that's an easy fix; we can deal with it with some stream buffering on our side (a sketch follows below). Thanks for helping with all of this again :)
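
Something like this is what I mean by buffering - a rough sketch that re-chunks the litellm async stream into more even pieces (min_chars is an arbitrary threshold I made up, not a litellm parameter):

async def buffered(stream, min_chars=80):
    # Accumulate streamed deltas until the buffer passes the threshold, then flush.
    buf = ""
    async for chunk in stream:
        buf += chunk.choices[0].delta.content or ""
        if len(buf) >= min_chars:
            yield buf
            buf = ""
    if buf:
        yield buf  # flush whatever remains when the stream ends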

krrishdholakia commented 7 months ago

@jameshiggie so this is a no-op?

jameshiggie commented 6 months ago

The example case still fails in a fresh venv with the same error as above; not sure why.

from litellm import acompletion
import asyncio, os, traceback

os.environ['GEMINI_API_KEY'] = "XYZ"

async def completion_call():
    try:
        print("test acompletion + streaming")
        messages=[{"role": "user","content": "write code for saying hi from LiteLLM"}]
        response = await acompletion(
            model="gemini/gemini-1.5-pro-latest", 
            messages=messages, 
            stream=True
        )
        print(f"response: {response}")
        async for chunk in response:
            print(chunk)
    except:
        print(f"error occurred: {traceback.format_exc()}")
        pass

await completion_call()

The chunking and yield behavior is slightly different between the two. It would be good to have some example code like this in the litellm docs that people can run quickly to test.

krrishdholakia commented 6 months ago

I'm seeing a similar issue - it works on CI/CD but throws the ResponseIterator issue locally. Not sure why.

I see a similar hang when I use the raw Google SDK, so I wonder if it's something in my environment.

I'm able to get it working via curl though. Will investigate and aim to have a better solution here by tomorrow @jameshiggie

jameshiggie commented 6 months ago

ok thanks! 🏋️

CXwudi commented 6 months ago

Sorry that I made a new issue just for supporting the system message for Gemini 1.5 Pro: https://github.com/BerriAI/litellm/issues/2963.

Also, I can confirm that the system message is supported by the playground UI in Google AI Studio, but unfortunately I can't find API documentation for how to pass the system message.
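
For context, in litellm's OpenAI-compatible message format I'd expect a system message to be passed like this (untested sketch; whether it actually reaches Gemini is the open question):

from litellm import completion
import os

os.environ['GEMINI_API_KEY'] = "<<REDACTED>>"

# System message in OpenAI format; the question is whether litellm
# forwards it as a Gemini system instruction.
response = completion(
    model="gemini/gemini-1.5-pro-latest",
    messages=[
        {"role": "system", "content": "You are a terse assistant."},
        {"role": "user", "content": "write code for saying hi from LiteLLM"},
    ],
)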

aleclarson commented 6 months ago

Is gemini/gemini-1.5-pro-latest still the recommended model name? I don't see it in the registry: https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json

Manouchehri commented 6 months ago

I'm using vertex_ai/gemini-1.5-pro-preview-0409.
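
For anyone following along, a minimal call for that model looks something like this (a sketch; the project and location values are placeholders for your own GCP settings):

import litellm

# Placeholders - replace with your own GCP project and region
litellm.vertex_project = "my-gcp-project"
litellm.vertex_location = "us-central1"

response = litellm.completion(
    model="vertex_ai/gemini-1.5-pro-preview-0409",
    messages=[{"role": "user", "content": "write code for saying hi from LiteLLM"}],
)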

krrishdholakia commented 6 months ago

@CXwudi system message is already supported for gemini - google ai studio - https://github.com/BerriAI/litellm/blob/6e934cb842f762830949312bc37760eb2d950b9e/litellm/llms/gemini.py#L146

@aleclarson gemini/.. is for calling via Google AI Studio (it's the one where Google gives you an API key), and yep, the way you're calling it looks fine.

vertex_ai/.. is for Google's Vertex AI implementation.

@aleclarson Curious - how do you use our model_prices json?

krrishdholakia commented 6 months ago

Pending items:

aleclarson commented 6 months ago

@krrishdholakia My PR which adds LiteLLM support to Aider needs it for various metadata (most importantly, which models are supported). See here: https://github.com/paul-gauthier/aider/pull/549/files#diff-da3f6418cba825fc2eac007d80f318784be5cf8f0f9a27433e2693338ca4c8b9R114

Manouchehri commented 6 months ago

@krrishdholakia You may want to merge #2964 before starting on Vertex AI system message.

krrishdholakia commented 6 months ago

@aleclarson got it.

The right way to check is by provider. For example, any model on TogetherAI can be called via litellm as together_ai/<model-name>.

That model might not be in the map (which is used for tracking price and context window for popular models).

You can check which providers we support via - https://github.com/BerriAI/litellm/blob/cd834e9d52691ad74ba4205263f6afd38bb857b1/litellm/__init__.py#L466
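
For example (a quick sketch; provider_list is the list at that link):

import litellm

# Membership in provider_list is the supported-provider check
print("gemini" in litellm.provider_list)       # Google AI Studio
print("together_ai" in litellm.provider_list)  # TogetherAI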

aleclarson commented 6 months ago

@krrishdholakia Good to know. Looks like the model_cost dict is more what I'm looking for. Though I don't like how main is hard-coded - shouldn't the version tag of the current LiteLLM installation be used instead? Then you could avoid hitting the network every time, since the local cache would be version-specific (and so invalidated whenever LiteLLM is updated).

krrishdholakia commented 6 months ago

The purpose is to avoid needing to upgrade litellm to get new models. Would welcome any improvement here.

Related issue: https://github.com/BerriAI/litellm/issues/411#issuecomment-1727789763
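
For reference, looking a model up in that dict is straightforward (a sketch; key naming follows the registry):

import litellm

# Keys follow the registry naming, e.g. "gemini/gemini-1.5-pro-latest"
info = litellm.model_cost.get("gemini/gemini-1.5-pro-latest")
if info:
    print(info.get("max_tokens"), info.get("input_cost_per_token"))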

CXwudi commented 6 months ago

> @CXwudi system message is already supported for gemini - google ai studio - https://github.com/BerriAI/litellm/blob/6e934cb842f762830949312bc37760eb2d950b9e/litellm/llms/gemini.py#L146

The code is correct, but it looks like we need to upgrade the google-generativeai dependency to something like 0.5.0 to make the system message actually pass through. Not sure if 0.4.1 is OK.

demux79 commented 6 months ago

I am using vertex_ai/gemini-1.5-pro-preview-0409, which should support function calls. However, using it with the proxy never returns a valid function call and crashes. It works well with other models. Any idea what I am doing wrong?

litellm | 17:11:27 - LiteLLM Router:DEBUG: router.py:1184 - Traceback
litellm | Traceback (most recent call last):
litellm |   File "/usr/local/lib/python3.11/site-packages/litellm/llms/vertex_ai.py", line 819, in async_completion
litellm |     for k, v in function_call.args.items():
litellm |                 ^^^^^^^^^^^^^^^^^^^^^^^^
litellm | AttributeError: 'NoneType' object has no attribute 'items'

CXwudi commented 6 months ago

For some reason, I am getting a message saying the gemini-1.5-pro-latest model is not in the model_prices_and_context_window.json [screenshot],

where I have the following in the YAML config:

- model_name: gemini-1.5-pro-latest
  litellm_params:
    model: gemini/gemini-1.5-pro-latest
    api_key: os.environ/GEMINI_API_KEY

krrishdholakia commented 6 months ago

Just added; should be fixed now @CXwudi.

This was caused b/c we run a check on whether the gemini model supports vision.

krrishdholakia commented 6 months ago

@demux79 tested on my end with our function calling test - https://github.com/BerriAI/litellm/blob/7656bd1c3f52c9a35c1a5b531da23ffca06069b8/litellm/tests/test_amazing_vertex_completion.py#L533

Works fine. I suspect this is an issue with the endpoint returning a weird response (maybe None?). If you're able to repro with litellm.set_verbose = True and share the raw response, that would help. A minimal harness is sketched below.

[Screenshot: 2024-04-16 at 6:54 PM]
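
A minimal harness along the lines of the linked test (a sketch; the weather tool is the stock example, not your actual schema):

import litellm

litellm.set_verbose = True  # capture the raw response from Vertex AI

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "City name"},
                },
                "required": ["location"],
            },
        },
    }
]

response = litellm.completion(
    model="vertex_ai/gemini-1.5-pro-preview-0409",
    messages=[{"role": "user", "content": "What's the weather like in Boston?"}],
    tools=tools,
)
print(response.choices[0].message.tool_calls)
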
demux79 commented 6 months ago

@krrishdholakia Thanks. Indeed, Gemini returns None for my function call. With a simple function call it works.

router.py:517 - litellm.acompletion(model=vertex_ai/gemini-1.5-pro-preview-0409) Exception VertexAIException - 'NoneType' object has no attribute 'items'

GPT-4 and Opus handle my more complicated function call quite well. It seems Gemini just isn't quite up to the same standard then ;)


Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/litellm/llms/vertex_ai.py", line 819, in async_completion
    for k, v in function_call.args.items():
                ^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'items'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/litellm/main.py", line 317, in acompletion
    response = await init_response
               ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/litellm/llms/vertex_ai.py", line 971, in async_completion
    raise VertexAIError(status_code=500, message=str(e))
litellm.llms.vertex_ai.VertexAIError: 'NoneType' object has no attribute 'items'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/litellm/router.py", line 1276, in async_function_with_retries
    response = await original_function(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/litellm/router.py", line 522, in _acompletion
    raise e
  File "/usr/local/lib/python3.11/site-packages/litellm/router.py", line 506, in _acompletion
    response = await _response
               ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/litellm/utils.py", line 3433, in wrapper_async
    raise e
  File "/usr/local/lib/python3.11/site-packages/litellm/utils.py", line 3265, in wrapper_async
    result = await original_function(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/litellm/main.py", line 338, in acompletion
    raise exception_type(
          ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/litellm/utils.py", line 8573, in exception_type
    raise e
  File "/usr/local/lib/python3.11/site-packages/litellm/utils.py", line 7817, in exception_type
    raise APIError(
litellm.exceptions.APIError: VertexAIException - 'NoneType' object has no attribute 'items'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/litellm/proxy/proxy_server.py", line 3589, in chat_completion
    responses = await asyncio.gather(
                ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/litellm/router.py", line 412, in acompletion
    raise e
  File "/usr/local/lib/python3.11/site-packages/litellm/router.py", line 408, in acompletion
    response = await self.async_function_with_fallbacks(**kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/litellm/router.py", line 1259, in async_function_with_fallbacks
    raise original_exception
  File "/usr/local/lib/python3.11/site-packages/litellm/router.py", line 1180, in async_function_with_fallbacks
    response = await self.async_function_with_retries(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/litellm/router.py", line 1370, in async_function_with_retries
    raise e
  File "/usr/local/lib/python3.11/site-packages/litellm/router.py", line 1332, in async_function_with_retries
    response = await original_function(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/litellm/router.py", line 522, in _acompletion
    raise e
  File "/usr/local/lib/python3.11/site-packages/litellm/router.py", line 427, in _acompletion
    deployment = await self.async_get_available_deployment(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/litellm/router.py", line 2566, in async_get_available_deployment
    return self.get_available_deployment(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/litellm/router.py", line 2678, in get_available_deployment
    rpm = healthy_deployments[0].get("litellm_params").get("rpm", None)
          ~~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range

CXwudi commented 6 months ago

> Just added; should be fixed now @CXwudi.
>
> This was caused b/c we run a check on whether the gemini model supports vision.

For some reason, it is still not fixed for me. The last working version I tried is v1.35.5. Also, it looks like multiple gemini models are affected; so far I know gemini-1.0-pro-latest is also affected.

hi019 commented 6 months ago

@demux79 The error is due to a faulty if statement in LiteLLM, see https://github.com/BerriAI/litellm/issues/3097 and my PR, https://github.com/BerriAI/litellm/pull/3101

CXwudi commented 6 months ago

> Just added; should be fixed now @CXwudi. This was caused b/c we run a check on whether the gemini model supports vision.
>
> For some reason, it is still not fixed for me. The last working version I tried is v1.35.5. Also, it looks like multiple gemini models are affected; so far I know gemini-1.0-pro-latest is also affected.

The problem is now solved with https://github.com/BerriAI/litellm/pull/3186, which works in v1.35.17.