This PR fixes the "argument list too long" error that text_generation.sh raises when it handles large base64-encoded images. The encoded data and the JSON payload are now stored in temporary files. This resolves the failures seen when running the text_gen_multimodal_one_image_prompt and text_gen_multimodal_one_image_prompt_streaming examples.
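For reviewers, here is a minimal sketch of the temporary-file approach described above. It is not the exact code in text_generation.sh: `build_image_payload` is a hypothetical helper, the `image/jpeg` MIME type is assumed, the JSON layout follows the `contents`/`parts`/`inline_data` shape of the Gemini REST API as I understand it, and the prompt is assumed to contain no characters that need JSON escaping.

```shell
#!/usr/bin/env bash
# Sketch only: build_image_payload is a hypothetical helper, not the PR's code.

# Writes a JSON payload for one image plus a text prompt to a temporary file
# and prints that file's path. The base64 data only ever travels through
# files and pipes, never through a command's argument list.
build_image_payload() {
  local image_path="$1" prompt="$2"
  local b64_file payload_file
  b64_file="$(mktemp)"
  payload_file="$(mktemp)"

  # Read the image from stdin and strip line wrapping so the output is a
  # single line with both GNU and BSD base64.
  base64 < "$image_path" | tr -d '\n' > "$b64_file"

  # Assemble the JSON through redirection. Assumes the prompt needs no
  # JSON escaping and the image is a JPEG (both are assumptions).
  {
    printf '{"contents":[{"parts":[{"text":"%s"},' "$prompt"
    printf '{"inline_data":{"mime_type":"image/jpeg","data":"'
    cat "$b64_file"
    printf '"}}]}]}\n'
  } > "$payload_file"

  rm -f "$b64_file"
  printf '%s\n' "$payload_file"
}
```

The resulting file can then be handed to curl by reference, for example `curl -H 'Content-Type: application/json' --data-binary @"$payload_file" "$URL"`, where `$URL` stands in for the model endpoint. Because curl reads the body from the file, the size of the image no longer counts against the kernel's argument-length limit.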
This PR also makes the following additions and updates:
- Adds two new multimodal vision examples to text_generation.sh:
  - text_gen_multimodal_two_image_prompt: Demonstrates using two images in a single prompt.
  - text_gen_multimodal_one_image_bounding_box_prompt: Shows how to generate bounding boxes for objects.
- Updates the text_gen_multimodal_video_prompt example: The prompt is now more comprehensive and does a better job demonstrating Gemini 1.5's multimodal capabilities.
Motivation
These examples are being added so they can be included in the revamped vision documentation.
Type of change
Feature request
Checklist
[x] I have performed a self-review of my code.
[x] I have added detailed comments to my code where applicable.
[x] I have verified that my change does not break existing code.
[x] My PR is based on the latest changes of the main branch (if unsure, please run git pull --rebase upstream main).
[x] I am familiar with the Google Style Guide for the language I have coded in.