asimplify opened this issue 3 weeks ago
Hey there, @asimplify! I'm a bot here to help you with any bugs, questions, or contributions. Let's tackle this issue together while we wait for a human maintainer.
To resolve the `GoogleGenerativeAIError: [400 Bad Request]` when using the `gemini-pro-vision` model for multimodal generation, ensure that you are correctly setting the `modelName` field to `gemini-pro-vision` and that you are providing the required parameters. Here is an example of how to instantiate and use the `ChatGoogleGenerativeAI` class with the `gemini-pro-vision` model:
```typescript
import fs from "fs";
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { HumanMessage } from "@langchain/core/messages";

// Multi-modal
const vision = new ChatGoogleGenerativeAI({
  modelName: "gemini-pro-vision",
  maxOutputTokens: 2048,
});
const image = fs.readFileSync("./hotdog.jpg").toString("base64");
const input2 = [
  new HumanMessage({
    content: [
      {
        type: "text",
        text: "Describe the following image.",
      },
      {
        type: "image_url",
        // The MIME type should match the file read above (a JPEG).
        image_url: `data:image/jpeg;base64,${image}`,
      },
    ],
  }),
];
const res2 = await vision.invoke(input2);
console.log(res2);
/*
AIMessage {
  content: ' The image shows a hot dog in a bun. The hot dog is grilled and has a dark brown color. The bun is toasted and has a light brown color. The hot dog is in the center of the bun.',
  name: 'model',
  additional_kwargs: {}
}
*/

// Multi-modal streaming
const res3 = await vision.stream(input2);
for await (const chunk of res3) {
  console.log(chunk);
}
/*
AIMessageChunk {
  content: ' The image shows a hot dog in a bun. The hot dog is grilled and has grill marks on it. The bun is toasted and has a light golden',
  name: 'model',
  additional_kwargs: {}
}
AIMessageChunk {
  content: ' brown color. The hot dog is in the center of the bun.',
  name: 'model',
  additional_kwargs: {}
}
*/
```
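If you want the full response text rather than the individual streamed pieces, the chunk contents can be stitched together. This is a minimal sketch with a hypothetical `joinChunks` helper, assuming each chunk exposes a plain string `content` (which holds for the text-only output shown above):

```typescript
// Hypothetical helper: join streamed chunk contents into one string.
// Assumes each chunk has a string `content` field, as the
// AIMessageChunk output above does for text-only responses.
function joinChunks(chunks: { content: string }[]): string {
  return chunks.map((c) => c.content).join("");
}

// Example with two chunks in the shape shown above:
const full = joinChunks([
  { content: " The image shows a hot dog in a bun." },
  { content: " The bun is toasted." },
]);
// full === " The image shows a hot dog in a bun. The bun is toasted."
```

LangChain's `AIMessageChunk` also supports `concat()` for merging chunks if you prefer to keep the message metadata rather than just the text.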
Ensure that the image is correctly encoded in base64 format and included in the query. If the error persists, double-check the API key and other parameters to ensure they are correctly set. Additionally, make sure you are using the field name `modelName` rather than `model` when initializing the `ChatGoogleGenerativeAI` class [1][2][3].
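On the base64 point: the data URL's MIME type should match the actual image bytes (the example reads a JPEG), and a small helper can derive it from the file extension. This is a hypothetical sketch, not a LangChain API:

```typescript
// Hypothetical helper: build a base64 data URL whose MIME type
// matches the file extension, so a .jpg isn't labeled image/png.
const MIME_BY_EXT: Record<string, string> = {
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".png": "image/png",
  ".webp": "image/webp",
};

function dataUrlFor(path: string, base64: string): string {
  const ext = path.slice(path.lastIndexOf(".")).toLowerCase();
  const mime = MIME_BY_EXT[ext] ?? "application/octet-stream";
  return `data:${mime};base64,${base64}`;
}

// dataUrlFor("./hotdog.jpg", "abc") → "data:image/jpeg;base64,abc"
```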
Correct, this is a result of a set of limitations of the `gemini-pro-vision` model. It does not support chat mode, so only the first human message is processed, and subsequent attempts will generate the error shown.
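Given that limitation, one workaround (if `gemini-pro-vision` must be used) is to collapse the conversation to a single human message before each call. A minimal sketch with a hypothetical `firstHumanOnly` helper, using plain objects with a `role` field; real LangChain messages would be inspected with `_getType()` instead:

```typescript
// Hypothetical workaround: keep only the first human message, since
// gemini-pro-vision rejects multi-turn chat input with a 400 error.
type SimpleMessage = { role: string; content: unknown };

function firstHumanOnly(messages: SimpleMessage[]): SimpleMessage[] {
  const human = messages.find((m) => m.role === "human");
  return human ? [human] : [];
}
```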
While `gemini-pro-vision` is still a valid model on the AI Studio / Google Generative AI platform, it isn't listed as being in production, whereas the Gemini 1.5 models (which don't have the same limitation) are in production.
Since I commented yesterday, Google has now officially deprecated Gemini 1.0 Pro Vision (see https://ai.google.dev/gemini-api/docs/models/gemini#gemini-1.0-pro-vision) in favor of Gemini 1.5. It is scheduled to be removed from the AI Studio API on July 12th 2024.
If you feel you still need Gemini 1.0 Pro Vision, you can switch to Vertex AI, although it is only scheduled to be supported until Feb 2025.
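For most users the simpler path is to migrate the model name. A hypothetical mapping helper sketching that migration, based on the deprecation notice above (`gemini-1.5-flash` is one reasonable replacement; `gemini-1.5-pro` would also work):

```typescript
// Hypothetical helper: map deprecated vision model names to a
// suggested Gemini 1.5 replacement, per the deprecation notice.
const DEPRECATED_REPLACEMENTS: Record<string, string> = {
  "gemini-pro-vision": "gemini-1.5-flash",
  "gemini-1.0-pro-vision": "gemini-1.5-flash",
};

function resolveModelName(name: string): string {
  return DEPRECATED_REPLACEMENTS[name] ?? name;
}
```

The resolved name can then be passed as `modelName` when constructing `ChatGoogleGenerativeAI`; no other code changes are needed for the multimodal example above.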
@jacoblee93 - Suggest closing this issue as not being a LangChain.js bug.
Checked other resources
Example Code
Error Message and Stack Trace (if applicable)
```
GoogleGenerativeAIError: [400 Bad Request] Add an image to use models/gemini-pro-vision, or switch your model to a text model. { status: 400, attemptNumber: 1, retriesLeft: 6 }
```
This is the error produced when using the `gemini-pro-vision` model.
Description
I am using the `ChatGoogleGenerativeAI` class for multimodal generation. It works fine with `gemini-1.5-pro` and `gemini-1.5-flash`, but `gemini-pro-vision` produces an error even when I provide an image with the required query.
I am following the steps as mentioned in documentation: https://js.langchain.com/v0.2/docs/integrations/chat/google_generativeai
System Info
Platform: Windows 10
Node version: 18.13.0
NPM version: 9.4.1