GoogleCloudPlatform / generative-ai

Sample code and notebooks for Generative AI on Google Cloud, with Gemini on Vertex AI
https://cloud.google.com/vertex-ai/docs/generative-ai/learn/overview
Apache License 2.0

[Bug]: InvalidArgument: 400 Request contains an invalid argument. #684

Closed: Julian-Cao closed this issue 2 months ago

Julian-Cao commented 2 months ago

File Name

gemini/getting-started/intro_gemini_1_5_pro.ipynb

What happened?

A bug happened! InvalidArgument: 400 Request contains an invalid argument.

It seems that Gemini as a whole (not just 1.5, and whether accessed through the API, the console, or AI Studio) applies special restrictions to, and special handling of, the text inside the prompt.

I copied all of the text out of a PDF file and passed it to the model, hoping it would produce a summary and some basic analysis, but I received the error below instead.
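
The failing call looks roughly like this (a simplified sketch, not exactly what I ran; the pypdf extraction step and the project, location, and model name are placeholders):

```python
# Rough sketch of the failing call (see the traceback in the log output below).
# pypdf, the project/location values, and the model name are placeholders.
import vertexai
from pypdf import PdfReader
from vertexai.generative_models import GenerativeModel

vertexai.init(project="your-project-id", location="us-central1")  # placeholders
model = GenerativeModel("gemini-1.5-pro-preview-0409")  # placeholder model name

# Concatenate the text of every page into one long string.
file_content = ""
for page in PdfReader("document.pdf").pages:  # placeholder local file
    file_content += page.extract_text() or ""

prompt = f"""
Generate a summary, an outline and some related questions from the following content: {file_content}
"""

response = model.generate_content([prompt])  # raises InvalidArgument: 400 here
print(response.text)
```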

Relevant log output

---------------------------------------------------------------------------
_InactiveRpcError                         Traceback (most recent call last)
File ~/.local/lib/python3.10/site-packages/google/api_core/grpc_helpers.py:79, in _wrap_unary_errors.<locals>.error_remapped_callable(*args, **kwargs)
     78 try:
---> 79     return callable_(*args, **kwargs)
     80 except grpc.RpcError as exc:

File ~/.local/lib/python3.10/site-packages/grpc/_channel.py:1160, in _UnaryUnaryMultiCallable.__call__(self, request, timeout, metadata, credentials, wait_for_ready, compression)
   1154 (
   1155     state,
   1156     call,
   1157 ) = self._blocking(
   1158     request, timeout, metadata, credentials, wait_for_ready, compression
   1159 )
-> 1160 return _end_unary_response_blocking(state, call, False, None)

File ~/.local/lib/python3.10/site-packages/grpc/_channel.py:1003, in _end_unary_response_blocking(state, call, with_call, deadline)
   1002 else:
-> 1003     raise _InactiveRpcError(state)

_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.INVALID_ARGUMENT
    details = "Request contains an invalid argument."
    debug_error_string = "UNKNOWN:Error received from peer ipv4:127.0.0.1:8800 {grpc_message:"Request contains an invalid argument.", grpc_status:3, created_time:"2024-05-15T16:21:20.50847412+08:00"}"
>

The above exception was the direct cause of the following exception:

InvalidArgument                           Traceback (most recent call last)
Cell In[45], line 19
     13     file_content += text
     15 prompt = f"""
     16 Generate a sumamry ,an outline and some releated questions from the following content : {file_content}
     17 """
---> 19 response = model.generate_content([prompt])
     20 print(response.text)
     21 # for candidates in response:
     22 #     print(candidates.text, end="", flush=True)

File ~/.local/lib/python3.10/site-packages/vertexai/generative_models/_generative_models.py:407, in _GenerativeModel.generate_content(self, contents, generation_config, safety_settings, tools, tool_config, stream)
    399     return self._generate_content_streaming(
    400         contents=contents,
    401         generation_config=generation_config,
   (...)
    404         tool_config=tool_config,
    405     )
    406 else:
--> 407     return self._generate_content(
    408         contents=contents,
    409         generation_config=generation_config,
    410         safety_settings=safety_settings,
    411         tools=tools,
    412         tool_config=tool_config,
    413     )

File ~/.local/lib/python3.10/site-packages/vertexai/generative_models/_generative_models.py:496, in _GenerativeModel._generate_content(self, contents, generation_config, safety_settings, tools, tool_config)
    471 """Generates content.
    472 
    473 Args:
   (...)
    487     A single GenerationResponse object
    488 """
    489 request = self._prepare_request(
    490     contents=contents,
    491     generation_config=generation_config,
   (...)
    494     tool_config=tool_config,
    495 )
--> 496 gapic_response = self._prediction_client.generate_content(request=request)
    497 return self._parse_response(gapic_response)

File ~/.local/lib/python3.10/site-packages/google/cloud/aiplatform_v1beta1/services/prediction_service/client.py:2103, in PredictionServiceClient.generate_content(self, request, model, contents, retry, timeout, metadata)
   2100 self._validate_universe_domain()
   2102 # Send the request.
-> 2103 response = rpc(
   2104     request,
   2105     retry=retry,
   2106     timeout=timeout,
   2107     metadata=metadata,
   2108 )
   2110 # Done; return the response.
   2111 return response

File ~/.local/lib/python3.10/site-packages/google/api_core/gapic_v1/method.py:131, in _GapicCallable.__call__(self, timeout, retry, compression, *args, **kwargs)
    128 if self._compression is not None:
    129     kwargs["compression"] = compression
--> 131 return wrapped_func(*args, **kwargs)

File ~/.local/lib/python3.10/site-packages/google/api_core/grpc_helpers.py:81, in _wrap_unary_errors.<locals>.error_remapped_callable(*args, **kwargs)
     79     return callable_(*args, **kwargs)
     80 except grpc.RpcError as exc:
---> 81     raise exceptions.from_grpc_error(exc) from exc

InvalidArgument: 400 Request contains an invalid argument.


holtskinner commented 2 months ago

For PDF input, it's recommended to pass the PDF directly to Gemini 1.5. You can see examples of document processing in this notebook:

https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/document-processing/document_processing.ipynb
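
A minimal sketch of that approach, assuming the PDF has already been uploaded to a Cloud Storage bucket (the bucket URI, project, location, and model name below are placeholders):

```python
# Minimal sketch: pass the PDF itself to Gemini 1.5 instead of extracted text.
# The GCS URI, project/location, and model name are placeholders.
import vertexai
from vertexai.generative_models import GenerativeModel, Part

vertexai.init(project="your-project-id", location="us-central1")
model = GenerativeModel("gemini-1.5-pro-preview-0409")

pdf_part = Part.from_uri(
    "gs://your-bucket/your-document.pdf",  # placeholder GCS URI
    mime_type="application/pdf",
)

response = model.generate_content(
    [
        pdf_part,
        "Generate a summary, an outline and some related questions from this document.",
    ]
)
print(response.text)
```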

Julian-Cao commented 2 months ago

> For PDF input, it's recommended to pass the PDF directly to Gemini 1.5. You can see examples of document processing in this notebook:
>
> https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/document-processing/document_processing.ipynb

I don't think this is related to how the PDF content is handled. Even if the raw string never came from a PDF at all, passing it to Gemini (AI Studio, console, or API) still throws a 400 error saying that a parameter is invalid.

So which parameter is invalid? The long string? The same text worked with GPT-3.5 and Claude, so something odd must be going on here.
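
One way to narrow this down is to check how large the prompt actually is before sending it, e.g. with the SDK's count_tokens method (a diagnostic sketch; the project, location, and model name are placeholders):

```python
# Diagnostic sketch: measure the prompt before calling generate_content,
# to see whether the long string is anywhere near the model's limits.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="your-project-id", location="us-central1")  # placeholders
model = GenerativeModel("gemini-1.5-pro-preview-0409")  # placeholder model name

long_text = "..."  # the raw string copied out of the PDF
token_info = model.count_tokens(long_text)
print(token_info.total_tokens, token_info.total_billable_characters)
```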

What's more, Gemini imposes limits, such as size and page caps, when you pass PDF files by GCS URI, so the native method is not always the right way to handle PDF content. Sometimes you need to parse the file content yourself.
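
When the native PDF path doesn't fit, a common fallback is to chunk the text you parsed yourself and summarize it in pieces, roughly like this (the chunk size, project, location, and model name are placeholders):

```python
# Fallback sketch: split self-parsed PDF text into chunks, summarize each chunk,
# then combine the partial summaries. Chunk size and model name are placeholders.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="your-project-id", location="us-central1")  # placeholders
model = GenerativeModel("gemini-1.5-pro-preview-0409")  # placeholder model name


def chunk_text(text: str, max_chars: int = 100_000) -> list[str]:
    """Naive fixed-size chunking; a real pipeline might split on page or section boundaries."""
    return [text[i : i + max_chars] for i in range(0, len(text), max_chars)]


file_content = "..."  # text you extracted from the PDF yourself

partial_summaries = []
for chunk in chunk_text(file_content):
    part = model.generate_content([f"Summarize the following content:\n{chunk}"])
    partial_summaries.append(part.text)

# Merge the partial summaries into the final summary, outline and questions.
final = model.generate_content(
    [
        "Combine these partial summaries into one summary, an outline and some related questions:\n"
        + "\n".join(partial_summaries)
    ]
)
print(final.text)
```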