Cannot put multiple images in the request `instances` when using PredictionServiceClient to get image captions

dbackspace commented 9 months ago

Describe the issue

I used this request to get the caption of an image (https://cloud.google.com/vertex-ai/docs/generative-ai/model-reference/image-captioning). In the request parameters, instances is a JSON array, so I expected to be able to put several image objects in it. But when I try that, the call fails with: "REASON":"io.grpc.StatusRuntimeException","STACK":"com.google.api.gax.rpc.InvalidArgumentException: io.grpc.StatusRuntimeException: INVALID_ARGUMENT

minherz commented 9 months ago

Hello @dbackspace, according to your description you are asking for assistance in using the REST API of Vertex AI. I am afraid you posted it in the wrong place. This repository is intended for code samples that are published in Google documentation. If your question relates to a particular code sample hosted in this repository, can you kindly point to the location of the code sample? Otherwise, I think the better place to ask your question would be the Vertex AI Platform community forum.

There are a few examples that you might find useful. They also reference client libraries in 3 languages that should simplify calling the Vertex AI API.

Regarding the specific error, I cannot tell you much because you omitted important information about what you actually submitted. The elements in the array are supposed to be objects with an image key, where each image has to have either a bytesBase64Encoded or a gcsUri field, and a mimeType field. Can you please validate that you submit correct JSON?
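
The shape described above can be checked before a request is sent. The following is a minimal sketch using only the Java standard library, not the Vertex AI client. The class and method names are hypothetical, and treating bytesBase64Encoded and gcsUri as mutually exclusive is this sketch's reading of "either ... or". It does not check mimeType.

```java
import java.util.Map;

class InstanceShapeCheck {
    // Returns true when the instance has an "image" object that carries
    // exactly one of bytesBase64Encoded or gcsUri as a string value.
    // Assumption: the two fields are mutually exclusive.
    public static boolean hasValidImage(Map<String, Object> instance) {
        Object image = instance.get("image");
        if (!(image instanceof Map)) {
            return false;
        }
        Map<?, ?> fields = (Map<?, ?>) image;
        boolean hasBytes = fields.get("bytesBase64Encoded") instanceof String;
        boolean hasUri = fields.get("gcsUri") instanceof String;
        return hasBytes ^ hasUri;
    }

    public static void main(String[] args) {
        Map<String, Object> ok = Map.of("image", Map.of("bytesBase64Encoded", "xxxxx"));
        Map<String, Object> wrongKey = Map.of("picture", Map.of("bytesBase64Encoded", "xxxxx"));
        System.out.println(hasValidImage(ok));
        System.out.println(hasValidImage(wrongKey));
    }
}
```

A check like this only catches structural mistakes in the JSON. It cannot tell whether the service accepts more than one instance per request.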

As I mentioned, we don't provide support for Google APIs in these repos. I am closing this issue for now. If you feel there is an error in our documentation, please use the "Send Feedback" button at the bottom of that documentation page.

dbackspace commented 9 months ago

Hello. When I sent a request with a single image, I followed the sample JSON in the Google documentation, and it worked normally. But when I tried to send multiple images in one prediction request, like the instances below, it logged the INVALID_ARGUMENT error I provided in the issue description.


{
  "instances": [
    {
      "image": {
        "bytesBase64Encoded": "xxxxx"
      }
    },
    {
      "image": {
        "bytesBase64Encoded": "yyyyy"
      }
    },
    {
      "image": {
        "bytesBase64Encoded": "zzzzz"
      }
    }
  ]
}

minherz commented 9 months ago

@dbackspace thank you for providing me with a reference. I cannot be certain, but looking at the protobuf of the request and comparing it with the info you provided, I see that the mimeType parameter is missing from each of the instances. The mimeType parameter should be a string containing one of the following MIME types of the image:

image/jpeg
image/gif
image/png
image/webp
image/bmp
image/tiff
image/vnd.microsoft.icon

I suggest setting this parameter to the "right" value and running your request again. I see that the documentation neither mentions the values nor shows this parameter in the example. I will open a bug about it.
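
One way to pick the "right" value is to derive it from the file name and reject anything outside the list above. This is a minimal sketch using only the Java standard library. The class name, method names, and the extension-to-type mapping are assumptions of this sketch. Only the MIME type strings themselves come from the list above.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

class CaptionMimeTypes {
    // File extension to MIME type, limited to the types listed in this thread.
    // The choice of extensions is an assumption of this sketch.
    private static final Map<String, String> BY_EXTENSION = Map.of(
        "jpg", "image/jpeg",
        "jpeg", "image/jpeg",
        "gif", "image/gif",
        "png", "image/png",
        "webp", "image/webp",
        "bmp", "image/bmp",
        "tif", "image/tiff",
        "tiff", "image/tiff",
        "ico", "image/vnd.microsoft.icon");

    // Looks up the MIME type for a file name, ignoring the case of the extension.
    // Returns empty when there is no extension or it maps to no listed type.
    public static Optional<String> forFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Optional.empty();
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return Optional.ofNullable(BY_EXTENSION.get(extension));
    }

    // True when the given string is one of the listed MIME types.
    public static boolean isListed(String mimeType) {
        return BY_EXTENSION.containsValue(mimeType);
    }

    public static void main(String[] args) {
        System.out.println(forFileName("photo.JPG").orElse("unsupported"));
        System.out.println(forFileName("notes.txt").orElse("unsupported"));
    }
}
```

Guessing from the extension is a convenience only. It does not inspect the image bytes, so a mislabeled file would still get the wrong type.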

dbackspace commented 9 months ago

@minherz Thank you for your detailed answer. I understand what you mean, but my point is that I cannot request predictions for multiple images in a single request. Also, when I looked at the Vertex AI documentation, mimeType was an optional parameter. After a few days, I found another solution: the Gemini API (the upgraded version of PaLM). With it I can attach a maximum of 16 images in a single request. Thank you for your effort. Have a good day!

minherz commented 9 months ago

Thank you for the feedback. Indeed, Gemini's multimodal input supports sending multiple images in a more convenient way. However, Gemini is just a model, while we were discussing the use of the API that utilizes the model. Regarding your words about

...when I saw in VertexAI documentation, mimeType is optional parameter

The documentation that you referenced says nothing about whether this is optional or not. I checked the protobuf declaration and I do not see it as optional, because it expects certain values to be passed as a string. I filed a bug to fix the documentation so it will be more explicit regarding this parameter and a few others.

minherz commented 9 months ago

@dbackspace for closure, I confirmed that the Vertex AI Predict API does not support sending multiple images, even though the type of the instances field implies that it does. The documentation will soon be updated to capture this and a couple of other constraints.
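
Given that constraint, a caller with several images would need one request per image. The following is a minimal sketch of building those request bodies with only the Java standard library. The class and method names are hypothetical, actually sending each body (for example through PredictionServiceClient) is left out, and the body mirrors the JSON posted earlier in this thread, so it carries no mimeType or other parameters.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

class SingleImageRequests {
    // Builds a request body whose instances array holds exactly one image.
    // Standard Base64 output contains no characters that need JSON escaping.
    public static String requestBodyFor(byte[] imageBytes) {
        String encoded = Base64.getEncoder().encodeToString(imageBytes);
        return "{\"instances\":[{\"image\":{\"bytesBase64Encoded\":\"" + encoded + "\"}}]}";
    }

    // Builds one body per image, in the same order as the input list.
    public static List<String> requestBodiesFor(List<byte[]> images) {
        List<String> bodies = new ArrayList<>();
        for (byte[] image : images) {
            bodies.add(requestBodyFor(image));
        }
        return bodies;
    }

    public static void main(String[] args) {
        // Placeholder bytes stand in for real image data.
        List<byte[]> images = List.of(
            "first".getBytes(StandardCharsets.UTF_8),
            "second".getBytes(StandardCharsets.UTF_8));
        for (String body : requestBodiesFor(images)) {
            System.out.println(body);
        }
    }
}
```

Each body would then be sent as its own predict call, and the captions collected in input order.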

dbackspace commented 8 months ago

Okay, thanks a lot for your effort and your detailed answer. Have a good day ~~

GoogleCloudPlatform / java-docs-samples