googleapis / python-aiplatform

A Python SDK for Vertex AI, a fully managed, end-to-end platform for data science and machine learning.

list index out of range full_response.candidates[0] #4203

Closed. panla closed this issue 1 month ago.

panla commented 2 months ago


Steps to reproduce

  1. Use the model gemini-1.5-pro-001.
  2. Ask the model a question with a video attached; the video is 400 seconds long and 47 MB.
  3. When the API responds, the error list index out of range occurs at response_message = full_response.candidates[0].content.

Code example

Not needed; see the stack trace below.

Stack trace

  File "/home/project/apps/modules/ai_vertex.py", line 280, in get_responses
    async for response in responses:
  File "/usr/local/lib/python3.11/site-packages/vertexai/generative_models/_generative_models.py", line 1324, in async_generator
    response_message = full_response.candidates[0].content
                       ~~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range


Ark-kun commented 2 months ago

> Code example: not needed

Can you please provide a code example? I think it would help me help you.

panla commented 2 months ago

@Ark-kun This is a bug in python-aiplatform, not in user code.

https://github.com/googleapis/python-aiplatform/blob/v1.61.0/vertexai/generative_models/_generative_models.py

At line 1256:

response_message = full_response.candidates[0].content

The code needs to check whether full_response.candidates is None or [].
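Something like this guard would avoid the crash (a sketch only, not the library's actual code; a plain ValueError stands in for whatever error type the SDK would choose):

if not full_response.candidates:  # covers both None and []
    raise ValueError(f'Response has no candidates: {full_response}')
response_message = full_response.candidates[0].content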

Ark-kun commented 2 months ago

> The code needs to check whether full_response.candidates is None or [].

We already do that. See line 1114: https://github.com/googleapis/python-aiplatform/blob/40fb5c4f2d03bc0ca7e17cf91b99148454ab3e11/vertexai/generative_models/_generative_models.py#L1114

This is why I'm asking to see your code.

P.S. It's likely that your response was blocked by the safety filters or something similar, so it has no candidates. In Python, indexing into an empty list raises an IndexError.
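On the caller side you can detect this before indexing. A rough sketch (model and contents as in your own setup):

response = await model.generate_content_async(contents=contents)
if not response.candidates:
    # blocked responses arrive with an empty candidates list; the response
    # itself still carries the prompt_feedback with the block_reason
    print('no candidates, likely blocked:', response)
else:
    print(response.candidates[0].content)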

panla commented 2 months ago

@Ark-kun

There is no special code; the console reports the same error. If you input a longer video, say 15 minutes, the error occurs both in the console and when calling from code.

If a check on that value had already been added, that would be great, but v1.63.0 still reports this error.

When the input includes a longer video, full_response.candidates in the first response is [], so if candidates is not checked first, an IndexError is raised.

The model introduction states that it can support one hour of video, but in reality it does not reach that.
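For now the only caller-side workaround seems to be skipping chunks with no candidates, e.g. (a sketch):

async for chunk in responses:
    if not chunk.candidates:
        # in a blocked stream the first chunk arrives with candidates == []
        logger.warning(f'chunk without candidates: {chunk}')
        continue
    logger.info(chunk.candidates[0].content)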

Ark-kun commented 2 months ago

Hello.

Can you please include the failing code that calls the content generation method? Can you also include the full stack trace?

Providing this information can help us help you.

panla commented 1 month ago
import asyncio
import base64
import traceback
from typing import Optional, List, AsyncIterable

from google.api_core.exceptions import InvalidArgument, PermissionDenied, InternalServerError, ServiceUnavailable
from vertexai.preview.generative_models import GenerationConfig, GenerativeModel
from vertexai.preview.generative_models import GenerationResponse, Candidate, Content, Part
from vertexai.preview.generative_models import ChatSession

from loguru import logger
from pydantic import BaseModel, Field

class FileConst:
    class Category:
        Image = ('image/png', 'image/jpeg')

        Audio = ('audio/mp3', 'audio/wav')

        Video = (
            'video/mov', 'video/mpeg', 'video/mp4', 'video/mpg', 'video/avi', 'video/wmv', 'video/mpegps', 'video/flv'
        )

class ContentParser(BaseModel):

    category: str = Field(..., description='text, image, video, audio')
    content: Optional[str] = Field(None)

    fileData: Optional[str] = Field(None, description='base64')
    mime: Optional[str] = Field(None)

    role: Optional[str] = Field('user')

class AIParser(BaseModel):

    contents: List[ContentParser] = Field(..., description='')

    temperature: Optional[float] = Field(1)
    topK: Optional[int] = Field(5)
    topP: Optional[float] = Field(0.8)
    maxOutputTokens: Optional[int] = Field(4096)

class GeminiBaseOp:

    MaxInput = 30720
    MaxOutput = 2048

    IS_VISION = False
    MODEL_NAME = ""
    MaxRequestNum = 0

    Categories = ['text']
    MaxImageNum = 0
    MaxImageSize = 0
    MaxAudioDuration = 0
    MaxAudioNum = 0
    MaxVideoDuration = 0
    MaxVideoNum = 0

    @staticmethod
    def get_model_base(
            model_name: str,
            temperature: Optional[float] = None, top_p: Optional[float] = None,
            top_k: Optional[int] = None, max_output: Optional[int] = None
    ):

        generation_config = GenerationConfig(
            temperature=temperature, top_p=top_p, top_k=top_k, max_output_tokens=max_output
        )

        return GenerativeModel(model_name=model_name, generation_config=generation_config)

    @classmethod
    def build_content(cls, contents: List[ContentParser]) -> List[Part]:

        results = []

        for content in contents:

            if content.category not in cls.Categories:
                continue

            if content.category == 'text' and content.content:
                part = Part.from_text(text=content.content)

            elif content.category == 'video' and content.fileData and cls.MaxVideoNum > 0:

                try:
                    b_data = base64.b64decode(content.fileData)
                except Exception as exc:
                    raise Exception(f'base64.b64decode error: {exc}') from exc

                part = Part.from_data(data=b_data, mime_type=content.mime)

            else:
                continue

            results.append(part)

        return results

    @classmethod
    async def send_requests(
            cls,
            model: GenerativeModel, contents: List[Part], is_stream: bool,
            histories: Optional[List[Content]] = None
    ):

        chat_session = ChatSession(model=model, history=histories, response_validation=False)

        try:
            responses = await chat_session.send_message_async(content=contents, stream=is_stream)
        except InvalidArgument as exc:
            logger.error(f'chat, InvalidArgument, error: {exc}')
            raise Exception(f'chat, request Google error, argument error, 400')
        except PermissionDenied as exc:
            logger.error(f'chat, PermissionDenied, error: {exc}')
            raise Exception(f'chat, request Google error, error:PermissionDenied, 403')
        except InternalServerError as exc:
            logger.error(f'chat, InternalServerError, error: {exc}')
            raise Exception(f'chat, request Google error, error:InternalServerError, 500')
        except ServiceUnavailable as exc:
            logger.error(f'chat, ServiceUnavailable, error: {exc}')
            raise Exception(f'chat, request Google error, error:ServiceUnavailable, 503')
        except Exception as exc:
            logger.error(f'chat, send message error, {exc}')
            logger.error(f'chat, error type:{type(exc)}')
            logger.error(traceback.format_exc())
            raise Exception(f'chat, request Google error, error:{exc}')

        return responses

    @classmethod
    async def get_responses(cls, responses: AsyncIterable[GenerationResponse]):

        try:
            async for response in responses:
                logger.info(response)
        except IndexError as exc:
            logger.error(f'error:{exc}')
            logger.error(traceback.format_exc())

        except Exception as exc:
            logger.error(f'error:{exc}')
            logger.error(traceback.format_exc())

class GeminiV15ProOp(GeminiBaseOp):
    MaxInput = 2097152
    MaxOutput = 8192

    IS_VISION = True
    MODEL_NAME = 'gemini-1.5-pro-001'
    MaxRequestNum = 5

    Categories = ['text', 'image', 'audio', 'video']
    MaxImageNum = 3000
    MaxImageSize = 20 * 1024 * 1024
    MaxAudioDuration = 8 * 3600
    MaxAudioNum = 1
    MaxVideoDuration = 3000
    MaxVideoNum = 10

    @classmethod
    def get_model(cls, parser: AIParser):
        return cls.get_model_base(
            model_name=cls.MODEL_NAME, temperature=parser.temperature, top_p=parser.topP, top_k=parser.topK,
            max_output=min([cls.MaxOutput, parser.maxOutputTokens])
        )

async def main():
    with open('./files/2.mp4', 'rb') as f:
        file_data = base64.b64encode(f.read()).decode('utf-8')

    contents = [
        ContentParser(category='text', content='hello'),
        ContentParser(category='video', fileData=file_data, mime='video/mp4')
    ]

    parser = AIParser(contents=contents)

    model = GeminiV15ProOp.get_model(parser=parser)

    contents = GeminiV15ProOp.build_content(contents=parser.contents)

    responses = await GeminiV15ProOp.send_requests(
        model=model, contents=contents, is_stream=True
    )

    await GeminiV15ProOp.get_responses(responses=responses)

import vertexai

try:
    vertexai.init(project='project id', location='us-central1')
except Exception as exc:
    print(exc)

asyncio.run(main())

Output:

2024-09-05 15:34:49.901 | INFO     | __main__:get_responses:151 - prompt_feedback {
  block_reason: PROHIBITED_CONTENT
}
usage_metadata {
  prompt_token_count: 118001
  total_token_count: 118001
}

2024-09-05 15:34:49.904 | ERROR    | __main__:get_responses:153 - error:list index out of range
2024-09-05 15:34:49.905 | ERROR    | __main__:get_responses:154 - Traceback (most recent call last):
  File "E:\work\soulmate_service\googleAPI\tmp\server.py", line 150, in get_responses
    async for response in responses:
  File "D:\opt\miniconda3\envs\google_api\Lib\site-packages\vertexai\generative_models\_generative_models.py", line 1332, in async_generator
    response_message = full_response.candidates[0].content
                       ~~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range
2.mp4 is a 47.5 MB video, 00:06:40 in duration.

If I shorten the video a bit, it works fine.

@Ark-kun

Ark-kun commented 1 month ago

Thank you for providing the code. This makes it much easier to help you.

As I expected, the problem is that the response is blocked by the safety filters: block_reason: PROHIBITED_CONTENT.

Most users receive a proper actionable error message that shows the whole response and tells you why this likely happened (safety filters). Unfortunately, it looks like you explicitly disabled that by passing response_validation=False:

chat_session = ChatSession(model=model, history=histories, response_validation=False)

Here is what the documentation says about response_validation:

            response_validation: Whether to validate responses before adding
                them to chat history. By default, `send_message` will raise
                an error if the request or response is blocked or if the response
                is incomplete due to going over the max token limit.
                If set to `False`, the chat session history will always
                accumulate the request and response messages even if the
                response is blocked or incomplete. This can result in an unusable
                chat session state.

The documentation makes it clear that you really should not disable response validation, and you now see why: with validation disabled you can get partial messages in the history and no actionable errors.

I advise you to remove response_validation=False from your code.
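With validation left enabled, the same blocked request surfaces as a catchable, descriptive exception instead of a bare IndexError. Roughly (a sketch reusing your variable names; ResponseValidationError is the error type defined in _generative_models.py, and with stream=True it is raised while iterating the responses):

from vertexai.generative_models import ResponseValidationError

chat_session = ChatSession(model=model, history=histories)  # validation on by default
try:
    response = await chat_session.send_message_async(content=contents, stream=False)
except ResponseValidationError as exc:
    # the exception message includes the full response, e.g. block_reason: PROHIBITED_CONTENT
    logger.error(f'response blocked or incomplete: {exc}')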

Another thing I noticed immediately is the way you use ChatSession: you construct it every time from external history. I'm not sure why you use ChatSession at all. If you want manual, lower-level model querying, just use GenerativeModel.generate_content; you do not need to construct a ChatSession:

contents = (histories or []) + [Content(role=..., parts=[...])]
responses = await model.generate_content_async(contents=contents, stream=is_stream)
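Applied to your send_requests, that is roughly (a sketch; your logging and exception handling elided, and role='user' assumed since this replaces the chat session's user turn):

async def send_requests(model, parts, is_stream, histories=None):
    # build the request: prior turns plus the new user turn
    contents = (histories or []) + [Content(role='user', parts=parts)]
    return await model.generate_content_async(contents=contents, stream=is_stream)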