ab-kotecha opened this issue 6 months ago
Added Python version details.
Updated environment details for Vertex AI Workbench.
Hi @ab-kotecha,
Thank you for raising the issue.
As a quick fix, can you please set "add_sleep_after_page" to True and then re-run this block:
text_metadata_df, image_metadata_df = get_document_metadata(
    multimodal_model,  # we are passing the Gemini 1.0 Pro Vision model
    pdf_folder_path,
    image_save_dir="images",
    image_description_prompt=image_description_prompt,
    embedding_size=1408,
    add_sleep_after_page=True,  # uncomment this if you are running into API quota issues
    sleep_time_after_page=5,
    # generation_config = # see next cell
    # safety_settings = # see next cell
)
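The add_sleep_after_page flag above is essentially manual rate limiting: a fixed pause after every page. For quota-related failures, a more general pattern is exponential backoff with jitter around the model call. Here is a minimal, stdlib-only sketch; the RuntimeError stand-in is an assumption for illustration (real code would catch google.api_core.exceptions.ResourceExhausted instead):

```python
import random
import time


def with_backoff(fn, max_retries=4, base_delay=1.0):
    """Call fn(); on a (stand-in) quota error, retry with exponential backoff.

    Re-raises the last error once max_retries attempts are exhausted.
    """
    for attempt in range(max_retries):
        try:
            return fn()
        except RuntimeError:  # stand-in for ResourceExhausted / quota errors
            if attempt == max_retries - 1:
                raise
            # Double the wait each attempt and add jitter to avoid thundering herds.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            time.sleep(delay)
```

A caller would wrap the per-image request, e.g. `with_backoff(lambda: get_gemini_response(...))`, rather than sleeping unconditionally after every page.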
If you can also share a few more details about your workflow, it would really help me narrow down the issue so I can support you better:
Thanks for your suggestion, Lavi. I had already enabled add_sleep_after_page and sleep_time_after_page by uncommenting them earlier, on both Vertex AI Workbench and Google Colab. I have tried with and without those settings in all environments.
How many documents are you passing and how many pages do they have overall? -> I am using the same documents as in the demo, the Alphabet 10-K report, with no changes. I am running the script sequentially without a single change.

Do your documents contain a mix of images and text? -> Yes. The document is the same one that is part of the demo.

Are you getting the '_MultiThreadedRendezvous' error right at the start of processing, or does it happen after a few pages? -> It appears as soon as the first image goes for extraction processing:

Processing the file: --------------------------------- data/google-10k-sample-part1.pdf
Processing page: 1
Processing page: 2
Extracting image from page: 2, saved as: images/google-10k-sample-part1.pdf_image_1_0_11.jpeg

The error occurs in Jupyter as soon as the above line is printed.
Strangely enough, when I did the same thing in the Skills Boost lab environment, the issue was not there. Then I copied the notebook content from the lab into my personal account, and the issue appeared again. I was able to reproduce the same error on both Vertex AI Workbench and Google Colab, with or without the add_sleep_after_page and sleep_time_after_page settings.
I tried again; the following is the cell output for the code below:
# Specify the PDF folder with multiple PDF
# pdf_folder_path = "/content/data/" # if running in Google Colab/Colab Enterprise
pdf_folder_path = "data/" # if running in Vertex AI Workbench.
# Specify the image description prompt. Change it
image_description_prompt = """Explain what is going on in the image.
If it's a table, extract all elements of the table.
If it's a graph, explain the findings in the graph.
Do not include any numbers that are not mentioned in the image.
"""
# Extract text and image metadata from the PDF document
text_metadata_df, image_metadata_df = get_document_metadata(
    multimodal_model,  # we are passing the Gemini 1.0 Pro Vision model
    pdf_folder_path,
    image_save_dir="images",
    image_description_prompt=image_description_prompt,
    embedding_size=1408,
    add_sleep_after_page=True,  # uncomment this if you are running into API quota issues
    sleep_time_after_page=5,
    # generation_config = # see next cell
    # safety_settings = # see next cell
)
print("\n\n --- Completed processing. ---")
Processing the file: --------------------------------- data/google-10k-sample-part2.pdf
Processing page: 1
Extracting image from page: 1, saved as: images/google-10k-sample-part2.pdf_image_0_0_6.jpeg
{
"name": "InvalidArgument",
"message": "400 Request contains an invalid argument.",
"stack": "---------------------------------------------------------------------------
_MultiThreadedRendezvous Traceback (most recent call last)
File ~/stage5-ip/.venv_stage5/lib/python3.11/site-packages/google/api_core/grpc_helpers.py:173, in _wrap_stream_errors.<locals>.error_remapped_callable(*args, **kwargs)
172 prefetch_first = getattr(callable_, "_prefetch_first_result_", True)
--> 173 return _StreamingResponseIterator(
174 result, prefetch_first_result=prefetch_first
175 )
176 except grpc.RpcError as exc:
File ~/stage5-ip/.venv_stage5/lib/python3.11/site-packages/google/api_core/grpc_helpers.py:95, in _StreamingResponseIterator.__init__(self, wrapped, prefetch_first_result)
94 if prefetch_first_result:
---> 95 self._stored_first_result = next(self._wrapped)
96 except TypeError:
97 # It is possible the wrapped method isn't an iterable (a grpc.Call
98 # for instance). If this happens don't store the first result.
File ~/stage5-ip/.venv_stage5/lib/python3.11/site-packages/grpc/_channel.py:540, in _Rendezvous.__next__(self)
539 def __next__(self):
--> 540 return self._next()
File ~/stage5-ip/.venv_stage5/lib/python3.11/site-packages/grpc/_channel.py:966, in _MultiThreadedRendezvous._next(self)
965 elif self._state.code is not None:
--> 966 raise self
_MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.INVALID_ARGUMENT
	details = "Request contains an invalid argument."
	debug_error_string = "UNKNOWN:Error received from peer ipv6:%5B2404:6800:4009:815::200a%5D:443 {created_time:"2024-03-01T11:58:50.247534+05:30", grpc_status:3, grpc_message:"Request contains an invalid argument."}"
>
The above exception was the direct cause of the following exception:
InvalidArgument Traceback (most recent call last)
Cell In[9], line 14
7 image_description_prompt = """Explain what is going on in the image.
8 If it's a table, extract all elements of the table.
9 If it's a graph, explain the findings in the graph.
10 Do not include any numbers that are not mentioned in the image.
11 """
13 # Extract text and image metadata from the PDF document
---> 14 text_metadata_df, image_metadata_df = get_document_metadata(
15 multimodal_model, # we are passing gemini 1.0 pro vision model
16 pdf_folder_path,
17 image_save_dir="images",
18 image_description_prompt=image_description_prompt,
19 embedding_size=1408,
20 add_sleep_after_page = True, # Uncomment this if you are running into API quota issues
21 sleep_time_after_page = 5,
22 # generation_config = # see next cell
23 # safety_settings = # see next cell
24 )
26 print("\n\n --- Completed processing. ---")
File ~/stage5-ip/playground/read/utils/intro_multimodal_rag_utils.py:545, in get_document_metadata(generative_multimodal_model, pdf_folder_path, image_save_dir, image_description_prompt, embedding_size, generation_config, safety_settings, add_sleep_after_page, sleep_time_after_page)
537 image_for_gemini, image_name = get_image_for_gemini(
538 doc, image, image_no, image_save_dir, file_name, page_num
539 )
541 print(
542 f\"Extracting image from page: {page_num + 1}, saved as: {image_name}\"
543 )
--> 545 response = get_gemini_response(
546 generative_multimodal_model,
547 model_input=[image_description_prompt, image_for_gemini],
548 generation_config=generation_config,
549 safety_settings=safety_settings,
550 stream=True,
551 )
553 image_embedding = get_image_embedding_from_multimodal_embedding_model(
554 image_uri=image_name,
555 embedding_size=embedding_size,
556 )
558 image_description_text_embedding = (
559 get_text_embedding_from_text_embedding_model(text=response)
560 )
File ~/stage5-ip/playground/read/utils/intro_multimodal_rag_utils.py:363, in get_gemini_response(generative_multimodal_model, model_input, stream, generation_config, safety_settings)
355 response = generative_multimodal_model.generate_content(
356 model_input,
357 generation_config=generation_config,
358 stream=stream,
359 safety_settings=safety_settings,
360 )
361 response_list = []
--> 363 for chunk in response:
364 try:
365 response_list.append(chunk.text)
File ~/stage5-ip/.venv_stage5/lib/python3.11/site-packages/vertexai/generative_models/_generative_models.py:505, in _GenerativeModel._generate_content_streaming(self, contents, generation_config, safety_settings, tools)
482 \"\"\"Generates content.
483
484 Args:
(...)
497 A stream of GenerationResponse objects
498 \"\"\"
499 request = self._prepare_request(
500 contents=contents,
501 generation_config=generation_config,
502 safety_settings=safety_settings,
503 tools=tools,
504 )
--> 505 response_stream = self._prediction_client.stream_generate_content(
506 request=request
507 )
508 for chunk in response_stream:
509 yield self._parse_response(chunk)
File ~/stage5-ip/.venv_stage5/lib/python3.11/site-packages/google/cloud/aiplatform_v1beta1/services/prediction_service/client.py:2207, in PredictionServiceClient.stream_generate_content(self, request, model, contents, retry, timeout, metadata)
2204 self._validate_universe_domain()
2206 # Send the request.
-> 2207 response = rpc(
2208 request,
2209 retry=retry,
2210 timeout=timeout,
2211 metadata=metadata,
2212 )
2214 # Done; return the response.
2215 return response
File ~/stage5-ip/.venv_stage5/lib/python3.11/site-packages/google/api_core/gapic_v1/method.py:131, in _GapicCallable.__call__(self, timeout, retry, compression, *args, **kwargs)
128 if self._compression is not None:
129 kwargs["compression"] = compression
--> 131 return wrapped_func(*args, **kwargs)
File ~/stage5-ip/.venv_stage5/lib/python3.11/site-packages/google/api_core/grpc_helpers.py:177, in _wrap_stream_errors.<locals>.error_remapped_callable(*args, **kwargs)
173 return _StreamingResponseIterator(
174 result, prefetch_first_result=prefetch_first
175 )
176 except grpc.RpcError as exc:
--> 177 raise exceptions.from_grpc_error(exc) from exc
InvalidArgument: 400 Request contains an invalid argument."
}
I tried this again today in different environments and am getting the same issue. I am not sure whether I need to change any Google Cloud API settings or quota limits?
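Since the failure fires on one specific extracted image, another way to narrow down an InvalidArgument before touching quotas is to sanity-check the saved file locally. Below is a stdlib-only sketch; the extension list and size ceiling are illustrative assumptions, not documented Vertex AI limits:

```python
import os

# Assumed-acceptable image formats and request-size ceiling -- illustrative only.
ALLOWED_EXTS = {".png", ".jpg", ".jpeg", ".webp"}
MAX_BYTES = 20 * 1024 * 1024  # 20 MB, a conservative guess at a request limit


def looks_sendable(path):
    """Cheap pre-flight check on an extracted image before attaching it to a request.

    Rejects missing files, empty files, unexpected extensions, and oversized files.
    """
    ext = os.path.splitext(path)[1].lower()
    return (
        os.path.isfile(path)
        and ext in ALLOWED_EXTS
        and 0 < os.path.getsize(path) <= MAX_BYTES
    )
```

Running this over the images/ directory right after extraction would show whether the failing page produced a zero-byte or malformed file, which would explain a 400 from the API independent of quota.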
Hi @ab-kotecha, it seems like a localized issue on your end, possibly something to do with your access or quota. I tested the notebook again with my personal GCP account and it seems to be working fine. Are you using a personal GCP account (on the free $300 credits) or a corporate account?
Hi Lavi,
Thanks for your email. I tested this on a corporate account.
Which quota/limit do you think I should check? I looked at all the quotas under Service Limits, and none of them are being hit. I am not sure if there is any additional API that I need to enable?
Best, Abhishek
I have a similar bug. I'm trying to do a multimodal prompt with a video. I'm able to fetch the video to Python and display it, but Gemini says it can't access it.
I'm using Qwiklabs accounts:
https://www.cloudskillsboost.google/paths/183/course_templates/981/labs/489761
prompt = """
What is shown in this video?
Where should I go to see it?
What are the top 5 places in the world that look like this?
"""
video = Part.from_uri(
uri="gs://github-repo/img/gemini/multimodality_usecases_overview/mediterraneansea.mp4",
mime_type="video/mp4",
)
contents = [prompt, video]
responses = multimodal_model.generate_content(contents, stream=True)
print("-------Prompt--------")
print_multimodal_prompt(contents)
print("\n-------Response--------")
for response in responses:
    print(response.text, end="")
---------------------------------------------------------------------------
_MultiThreadedRendezvous Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/google/api_core/grpc_helpers.py:155, in _wrap_stream_errors.<locals>.error_remapped_callable(*args, **kwargs)
154 prefetch_first = getattr(callable_, "_prefetch_first_result_", True)
--> 155 return _StreamingResponseIterator(
156 result, prefetch_first_result=prefetch_first
157 )
158 except grpc.RpcError as exc:
File /opt/conda/lib/python3.10/site-packages/google/api_core/grpc_helpers.py:81, in _StreamingResponseIterator.__init__(self, wrapped, prefetch_first_result)
80 if prefetch_first_result:
---> 81 self._stored_first_result = next(self._wrapped)
82 except TypeError:
83 # It is possible the wrapped method isn't an iterable (a grpc.Call
84 # for instance). If this happens don't store the first result.
File /opt/conda/lib/python3.10/site-packages/grpc/_channel.py:543, in _Rendezvous.__next__(self)
542 def __next__(self):
--> 543 return self._next()
File /opt/conda/lib/python3.10/site-packages/grpc/_channel.py:969, in _MultiThreadedRendezvous._next(self)
968 elif self._state.code is not None:
--> 969 raise self
_MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
status = StatusCode.PERMISSION_DENIED
details = "Permission denied while accessing input file. Learn more about providing the appropriate credentials: https://cloud.google.com/vertex-ai/generative-ai/docs/start/quickstarts/quickstart-multimodal."
debug_error_string = "UNKNOWN:Error received from peer ipv4:173.194.193.95:443 {created_time:"2024-07-14T04:37:19.233624331+00:00", grpc_status:7, grpc_message:"Permission denied while accessing input file. Learn more about providing the appropriate credentials: https://cloud.google.com/vertex-ai/generative-ai/docs/start/quickstarts/quickstart-multimodal."}"
>
The above exception was the direct cause of the following exception:
PermissionDenied Traceback (most recent call last)
Cell In[12], line 18
15 print_multimodal_prompt(contents)
17 print("\n-------Response--------")
---> 18 for response in responses:
19 print(response.text, end="")
File ~/.local/lib/python3.10/site-packages/vertexai/generative_models/_generative_models.py:689, in _GenerativeModel._generate_content_streaming(self, contents, generation_config, safety_settings, tools, tool_config)
664 """Generates content.
665
666 Args:
(...)
680 A stream of GenerationResponse objects
681 """
682 request = self._prepare_request(
683 contents=contents,
684 generation_config=generation_config,
(...)
687 tool_config=tool_config,
688 )
--> 689 response_stream = self._prediction_client.stream_generate_content(
690 request=request
691 )
692 for chunk in response_stream:
693 yield self._parse_response(chunk)
File ~/.local/lib/python3.10/site-packages/google/cloud/aiplatform_v1beta1/services/prediction_service/client.py:2412, in PredictionServiceClient.stream_generate_content(self, request, model, contents, retry, timeout, metadata)
2409 self._validate_universe_domain()
2411 # Send the request.
-> 2412 response = rpc(
2413 request,
2414 retry=retry,
2415 timeout=timeout,
2416 metadata=metadata,
2417 )
2419 # Done; return the response.
2420 return response
File /opt/conda/lib/python3.10/site-packages/google/api_core/gapic_v1/method.py:113, in _GapicCallable.__call__(self, timeout, retry, *args, **kwargs)
110 metadata.extend(self._metadata)
111 kwargs["metadata"] = metadata
--> 113 return wrapped_func(*args, **kwargs)
File /opt/conda/lib/python3.10/site-packages/google/api_core/grpc_helpers.py:159, in _wrap_stream_errors.<locals>.error_remapped_callable(*args, **kwargs)
155 return _StreamingResponseIterator(
156 result, prefetch_first_result=prefetch_first
157 )
158 except grpc.RpcError as exc:
--> 159 raise exceptions.from_grpc_error(exc) from exc
PermissionDenied: 403 Permission denied while accessing input file. Learn more about providing the appropriate credentials: https://cloud.google.com/vertex-ai/generative-ai/docs/start/quickstarts/quickstart-multimodal.
Fixed my bug: it appears gemini-1.0-pro-vision no longer supports video input. I changed the model to gemini-1.5-flash and then it started working.
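Based on that finding (which I have not verified against official documentation), a defensive pattern is to keep a small capability map and fall back when a model is reported not to handle a modality. A hypothetical sketch, where the table reflects only the behaviour observed in this thread:

```python
# Assumed capability table, based solely on the behaviour reported in this thread.
SUPPORTS_VIDEO = {
    "gemini-1.0-pro-vision": False,  # video requests reportedly now rejected
    "gemini-1.5-flash": True,
}


def pick_video_model(preferred, fallback="gemini-1.5-flash"):
    """Return the preferred model if it is known to accept video, else the fallback."""
    return preferred if SUPPORTS_VIDEO.get(preferred, False) else fallback
```

The chosen name would then be passed to GenerativeModel(...) before building the video prompt, so a deprecated model degrades to a working one instead of failing mid-stream.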
Contact Details
abhishek@datavtar.com
File Name
gemini/getting-started/intro_gemini_python.ipynb
What happened?
Summary
Encountered an InvalidArgument error when executing a content generation request using Vertex AI's API in a custom processing workflow for PDF documents. The error occurs within the get_gemini_response function, disrupting the extraction and processing of image and text metadata.

Steps to Reproduce
1. Call the get_document_metadata function with a valid PDF document, specifying parameters for image description generation.
2. Observe the InvalidArgument error during the execution of get_gemini_response, specifically when calling generative_multimodal_model.generate_content.

Expected Behavior
The expected behavior is successful generation of content descriptions for images extracted from PDF documents without encountering an InvalidArgument error.

Actual Behavior
The process fails, triggering a _MultiThreadedRendezvous that leads to an InvalidArgument error. The traceback indicates an issue with the content generation request to Vertex AI's API.

Environment

Additional Context

Possible Causes and Solutions
generate_content API call.

Relevant log output

Code of Conduct