google-gemini / generative-ai-python

The official Python library for the Google Gemini API
https://pypi.org/project/google-generativeai/
Apache License 2.0

Support for Video (MP4) with Gemini Pro 1.5 #332

Closed schnee closed 3 months ago

schnee commented 4 months ago

Description of the bug:

An exception is thrown when processing video/mp4 MIME-type assets. No exception is thrown when processing other assets such as image/png.

import google.generativeai as genai
from google.ai.generativelanguage import Part, Blob
from google.ai.generativelanguage import GenerationConfig
import os
from dotenv import load_dotenv

load_dotenv()  # read GEMINI_API_KEY from a local .env file, if one exists
api_key = os.getenv("GEMINI_API_KEY")

genai.configure(api_key=api_key)

model = genai.GenerativeModel('models/gemini-1.5-pro-latest')

prompt_part = Part()
prompt_part.text = """describe this video in a few words."""

image_fn = "/path/to/short_video.mp4" # 374KB
with open(image_fn, "rb") as asset_data:
    asset = Blob()
    asset.mime_type = "video/mp4"
    asset.data = asset_data.read()

asset_part = Part()
asset_part.inline_data = asset

response = model.generate_content([prompt_part, asset_part],
                                  generation_config=GenerationConfig(temperature=0.0,),
                                 )

Actual vs expected behavior:

actual behavior:

following exception is thrown:

 /Users/schnee/projects/pmg/github/bw-impossible/creative_kitchen/repl.py
Traceback (most recent call last):
  File "/Users/schnee/projects/pmg/github/bw-impossible/creative_kitchen/repl.py", line 25, in <module>
    response = model.generate_content([prompt_part, asset_part],
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/schnee/Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/generativeai/generative_models.py", line 262, in generate_content
    response = self._client.generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/schnee/Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/ai/generativelanguage_v1beta/services/generative_service/client.py", line 791, in generate_content
    response = rpc(
               ^^^^
  File "/Users/schnee/Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/gapic_v1/method.py", line 131, in __call__
    return wrapped_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/schnee/Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/retry/retry_unary.py", line 293, in retry_wrapped_func
    return retry_target(
           ^^^^^^^^^^^^^
  File "/Users/schnee/Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/retry/retry_unary.py", line 153, in retry_target
    _retry_error_helper(
  File "/Users/schnee/Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/retry/retry_base.py", line 212, in _retry_error_helper
    raise final_exc from source_exc
  File "/Users/schnee/Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/retry/retry_unary.py", line 144, in retry_target
    result = target()
             ^^^^^^^^
  File "/Users/schnee/Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/timeout.py", line 120, in func_with_timeout
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/schnee/Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/grpc_helpers.py", line 78, in error_remapped_callable
    raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.InternalServerError: 500 An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting

expected behavior: gemini-1.5-pro-latest consumes the prompt and video file.

Any other information you'd like to share?

Note: over in Vertex AI land, the model is gemini-1.5-pro-preview-0409 and it does appear to process videos. I'm not sure whether my issue is a model capability problem or not; that model does not appear to be available to this SDK.

The video I'm using is quite short and small: < 10 seconds and ~384KB. It is also H.264 encoded, which appears to be accepted by Gemini.

singhniraj08 commented 4 months ago

@schnee, thank you for reporting this issue. The "500 An internal error has occurred" error looks intermittent, and the request should work now. This repository is for bugs in, or improvements to, the Gemini Python SDK. For issues or feature requests related to the Gemini API itself, we suggest using the "Send Feedback" option in the Gemini docs. Ref: Screenshot below.

image

schnee commented 4 months ago

@singhniraj08 Thank you. For an intermittent error, it happens to me every time, including 2 minutes ago. Can you share the MP4 you used to validate that it works for you?

I have ported over to the VertexAI Python SDK and am able to use it to process the same test case video.

MarkDaoust commented 4 months ago

We only announced video support in the API+SDK today, so it probably wasn't active when you tried it. Try again?

schnee commented 4 months ago

Thank you @MarkDaoust

I updated to google-ai-generativelanguage-0.6.3 and google-generativeai-0.5.3 and am getting another error now.

But you answered my base question: did the SDK support video, and the answer was no. This issue can be closed. I'll research the stack trace below and possibly open another ticket.

Traceback (most recent call last):
  File "/Users//projects//github/bw-impossible/creative_kitchen/mre.py", line 25, in <module>
    response = model.generate_content([prompt_part, asset_part],
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users//Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/generativeai/generative_models.py", line 262, in generate_content
    response = self._client.generate_content(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users//Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/ai/generativelanguage_v1beta/services/generative_service/client.py", line 812, in generate_content
    response = rpc(
               ^^^^
  File "/Users//Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/gapic_v1/method.py", line 131, in __call__
    return wrapped_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users//Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/retry/retry_unary.py", line 293, in retry_wrapped_func
    return retry_target(
           ^^^^^^^^^^^^^
  File "/Users//Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/retry/retry_unary.py", line 153, in retry_target
    _retry_error_helper(
  File "/Users//Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/retry/retry_base.py", line 212, in _retry_error_helper
    raise final_exc from source_exc
  File "/Users//Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/retry/retry_unary.py", line 144, in retry_target
    result = target()
             ^^^^^^^^
  File "/Users//Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/timeout.py", line 120, in func_with_timeout
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users//Library/Caches/pypoetry/virtualenvs/bw-impossible-_xsmYHYG-py3.11/lib/python3.11/site-packages/google/api_core/grpc_helpers.py", line 78, in error_remapped_callable
    raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.InvalidArgument: 400 * GenerateContentRequest.generation_config.response_schema.type: field predicate failed: $ != TYPE_UNSPECIFIED

MarkDaoust commented 4 months ago

Video works. Try the cookbook video tutorial: https://colab.research.google.com/github/google-gemini/cookbook/blob/main/quickstarts/Video.ipynb

GenerateContentRequest.generation_config.response_schema.type: field predicate failed: $ != TYPE_UNSPECIFIED

That error is about the generation_config.response_schema parameter. What did you pass for it? I can't see it in that paste.
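
One thing to try here, offered as an untested assumption rather than a confirmed fix: pass generation_config as a plain dict instead of the GenerationConfig proto class from google.ai.generativelanguage, so that only the keys you name are sent. The helper name generate_with_dict_config is hypothetical; model stands for the genai.GenerativeModel from the original report.

```python
from typing import Any

# Only the keys listed here are sent. No response_schema key is present,
# which (by assumption) avoids sending an unspecified schema type.
GENERATION_CONFIG: dict[str, Any] = {"temperature": 0.0}


def generate_with_dict_config(model: Any, parts: list[Any]) -> Any:
    """Call generate_content with a dict config instead of a proto object.

    Hypothetical helper: `model` is expected to be a genai.GenerativeModel.
    """
    return model.generate_content(parts, generation_config=GENERATION_CONFIG)
```

With the original script this would be called as generate_with_dict_config(model, [prompt_part, asset_part]).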

schnee commented 4 months ago

@MarkDaoust - I literally updated the packages and reran the code from the top.

I'm good with someone (me?) closing this ticket, and I appreciate the pointers to make the code run.

Bikatr7 commented 4 months ago

@MarkDaoust - I literally updated the packages and reran the code from the top.

I'm good with someone (me?) closing this ticket, and I appreciate the pointers to make the code run.

You can close it yourself if you resolved the issue.

joja16 commented 3 months ago

I encountered the error: ValueError('The response.parts quick accessor only works for a single candidate, but none were returned. Check the response.prompt_feedback to see if the prompt was blocked.')

Has anyone encountered the same? The video I uploaded was over 5 minutes long, here is my code:

import time
import google.generativeai as genai

safety_settings = [
    {
        "category": "HARM_CATEGORY_HARASSMENT",
        "threshold": "BLOCK_NONE",
    },
    {
        "category": "HARM_CATEGORY_HATE_SPEECH",
        "threshold": "BLOCK_NONE",
    },
    {
        "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
        "threshold": "BLOCK_NONE",
    },
    {
        "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
        "threshold": "BLOCK_NONE",
    },
]

# The paste omitted the enclosing function; this name and signature are assumed.
def describe_video(model, filepath):
    video_file = genai.upload_file(path=filepath)

    # Refresh the file handle on every pass so the loop can see the state change.
    while video_file.state.name == "PROCESSING":
        print('Waiting for video to be processed.')
        time.sleep(10)
        video_file = genai.get_file(video_file.name)

    if video_file.state.name == "ACTIVE":
        response = model.generate_content(['what is this?', video_file], safety_settings=safety_settings,
                                          request_options={"timeout": 600})

        genai.delete_file(video_file.name)
        return response
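
The polling loop above has no upper bound on waiting and no explicit handling for a file that ends up in a state other than ACTIVE. A minimal sketch of a bounded version follows. The name wait_until_active and its parameters are hypothetical; get_file is assumed to behave like genai.get_file, and sleep is injectable only so the loop can be exercised without real delays.

```python
import time
from typing import Any, Callable


def wait_until_active(
    file: Any,
    get_file: Callable[[str], Any],
    poll_seconds: float = 10.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Poll until the uploaded file leaves PROCESSING, then return it.

    Raises TimeoutError if the file is still PROCESSING after
    `timeout_seconds`, and RuntimeError if it settles in any state
    other than ACTIVE.
    """
    waited = 0.0
    while file.state.name == "PROCESSING":
        if waited >= timeout_seconds:
            raise TimeoutError(f"{file.name} still PROCESSING after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds
        file = get_file(file.name)  # refresh the handle on every pass
    if file.state.name != "ACTIVE":
        raise RuntimeError(f"{file.name} ended in state {file.state.name}")
    return file
```

Assuming the SDK calls used above, usage would look like video_file = wait_until_active(genai.upload_file(path=filepath), genai.get_file), followed by the generate_content call.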

MarkDaoust commented 3 months ago

Try printing the response object that you got back.
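
When printing, the fields worth checking are the ones the ValueError itself points at. A small sketch, assuming the response exposes candidates, prompt_feedback.block_reason and a per-candidate finish_reason; the helper name summarize_response is hypothetical.

```python
from typing import Any


def summarize_response(response: Any) -> str:
    """Describe whether a response has candidates and, if not, why."""
    candidates = list(getattr(response, "candidates", None) or [])
    if not candidates:
        # No candidates usually means the prompt itself was blocked.
        feedback = getattr(response, "prompt_feedback", None)
        reason = getattr(feedback, "block_reason", None)
        return f"no candidates returned; prompt_feedback.block_reason={reason}"
    first = candidates[0]
    return f"{len(candidates)} candidate(s); first finish_reason={first.finish_reason}"
```

Calling print(summarize_response(response)) before touching response.parts or response.text avoids the quick-accessor exception and shows the block reason directly.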

joja16 commented 3 months ago

Thanks for your reply. This is the response object I got back: image

And I changed to async code: image

joja16 commented 3 months ago

I found out the reason. My video contains footage of creating a doll character, and it starts with creating the body parts and clothes, so Google apparently flagged the video as sexually explicit. Is there any way to fix it?

github-actions[bot] commented 3 months ago

Marking this issue as stale since it has been open for 14 days with no activity. This issue will be closed if no further activity occurs.

github-actions[bot] commented 3 months ago

This issue was closed because it has been inactive for 28 days. Please post a new issue if you need further assistance. Thanks!