Open fwermus opened 7 months ago
If you run runTextAndImages() using the example script below, with YOUR_PROJECT_ID replaced by your own project ID, do you get the same error?
const genAI = new GeminiApp({
  region: 'us-central1',
  project_id: 'YOUR_PROJECT_ID',
});

// Read a Drive file and wrap it as an inline, base64-encoded part.
function fileToGenerativePart(id) {
  const file = DriveApp.getFileById(id);
  const imageBlob = file.getBlob();
  const base64EncodedImage = Utilities.base64Encode(imageBlob.getBytes());
  return {
    inlineData: {
      data: base64EncodedImage,
      mimeType: file.getMimeType()
    },
  };
}

async function runTextAndImages() {
  // For text-and-images input (multimodal), use the gemini-pro-vision model
  const model = genAI.getGenerativeModel({ model: "gemini-pro-vision" });
  const prompt = "What's different between these pictures?";
  const imageParts = [
    fileToGenerativePart("1LXeJgNhlpnpS0RBfil6Ybx7QRvfqwvEh"),
    fileToGenerativePart("1OFV88Zf5esi-Mtuap4iQyoCVeYlvIeqU"),
  ];
  const result = await model.generateContent([prompt, ...imageParts]);
  const response = await result.response;
  const text = response.text();
  console.log(text);
}
It fails with a PDF file and an API key, but it runs with a PDF file when using a project ID and region, and it also works in the Vertex AI IDE.
const genAI = new GeminiApp("API_KEY");
It is a requirement to use an API key instead of a project ID and region.
function fileToGenerativePart(id) {
  const file = DriveApp.getFileById(id);
  const imageBlob = file.getBlob();
  const base64EncodedImage = Utilities.base64Encode(imageBlob.getBytes());
  return {
    inlineData: {
      data: base64EncodedImage,
      mimeType: file.getMimeType()
    },
  };
}

async function runTextAndImages() {
  // For text-and-images input (multimodal), use the gemini-pro-vision model
  const model = genAI.getGenerativeModel({ model: "gemini-pro-vision" });
  const prompt = "What does the pdf file says?";
  const imageParts = [
    fileToGenerativePart("1WtZquqcIFinWl_xQ9ZQFe3Cn7S1ihDj-"),
  ];
  const result = await model.generateContent([prompt, ...imageParts]);
  const response = await result.response;
  const text = response.text();
  console.log(text);
}
With an API key and a PDF file, it throws:
Request to Gemini failed with response code 400 - {
  "error": {
    "code": 400,
    "message": "Add an image to use models/gemini-pro-vision, or switch your model to a text model.",
    "status": "INVALID_ARGUMENT"
  }
}
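For anyone debugging this, the error text above can be split into its HTTP code and JSON body so the status and message are easier to log. This is only a sketch: `parseGeminiError` is a hypothetical helper, not part of the library, and it assumes the message always has the "response code N - {json}" shape shown above.

```javascript
// Hypothetical helper (not part of GeminiApp): pull the HTTP code and the
// JSON error body out of a message shaped like the one in this thread.
function parseGeminiError(message) {
  const match = /response code (\d+) - (\{[\s\S]*\})/.exec(message);
  if (!match) {
    return null; // message does not follow the assumed shape
  }
  let body = null;
  try {
    body = JSON.parse(match[2]);
  } catch (e) {
    // Body was not valid JSON; keep it as null.
  }
  const error = body && body.error ? body.error : null;
  return {
    httpCode: Number(match[1]),
    status: error ? error.status : null,
    detail: error ? error.message : null,
  };
}

// Sample input copied from the error reported above.
const sample =
  'Request to Gemini failed with response code 400 - ' +
  '{ "error": { "code": 400, "message": "Add an image to use ' +
  'models/gemini-pro-vision, or switch your model to a text model.", ' +
  '"status": "INVALID_ARGUMENT" } }';

const parsed = parseGeminiError(sample);
console.log(parsed.httpCode, parsed.status);
```

In Apps Script this could be called from a `catch` block around `generateContent`, assuming the library surfaces the text as the thrown error's `message`.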
To my understanding, it accepts media files.
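The error message suggests that, at least with an API key, gemini-pro-vision does not count a PDF part as an image. A pre-flight check along these lines would catch that before the request is sent. This is a sketch under that assumption (inferred from the error text, not confirmed anywhere in this thread), and `hasImagePart` is a hypothetical helper name.

```javascript
// Hypothetical pre-flight check: true only if at least one part is an
// inline part whose MIME type starts with "image/".
// Assumption: the API-key endpoint treats only image/* parts as images.
function hasImagePart(parts) {
  return parts.some(function (part) {
    return (
      typeof part === "object" &&
      part !== null &&
      part.inlineData !== undefined &&
      typeof part.inlineData.mimeType === "string" &&
      part.inlineData.mimeType.indexOf("image/") === 0
    );
  });
}

// Parts shaped like the output of fileToGenerativePart() in this thread.
const pdfRequest = [
  "What does the pdf file says?",
  { inlineData: { data: "", mimeType: "application/pdf" } },
];
const imageRequest = [
  "What's different between these pictures?",
  { inlineData: { data: "", mimeType: "image/png" } },
];

console.log(hasImagePart(pdfRequest), hasImagePart(imageRequest));
```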
The differences between Google AI Studio and Vertex AI capabilities get confusing, and the different model versions add some complexity. The ability to inline PDFs is a preview feature in Gemini 1.5 Pro, so try changing:
const model = genAI.getGenerativeModel({ model: "gemini-pro-vision" });
to:
const model = genAI.getGenerativeModel({ model: "gemini-1.5-pro-preview-0409" });
You can read more in the "Overview of multimodal models" documentation.
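If the same script has to handle both images and PDFs, the model name could be chosen from the MIME types of the files being sent. A minimal sketch, using only the two model names mentioned in this thread; `pickModelName` is a hypothetical helper, and whether the preview model name is accepted with an API key (as opposed to a project ID and region) is not confirmed here.

```javascript
// Model names taken from this thread; availability per endpoint is an assumption.
const VISION_MODEL = "gemini-pro-vision";
const PDF_CAPABLE_MODEL = "gemini-1.5-pro-preview-0409";

// Hypothetical helper: use the 1.5 Pro preview model when any attachment
// is a PDF, otherwise keep the vision model.
function pickModelName(mimeTypes) {
  const hasPdf = mimeTypes.indexOf("application/pdf") !== -1;
  return hasPdf ? PDF_CAPABLE_MODEL : VISION_MODEL;
}

console.log(pickModelName(["application/pdf"]));
console.log(pickModelName(["image/png", "image/jpeg"]));
```

The result would then be passed as the `model` value in `genAI.getGenerativeModel({ model: ... })`.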
I am running in Apps Script. If I run it using the location and project ID, it runs fine. But if I change to an API key, I get:
Request to Gemini failed with response code 400 - { "error": { "code": 400, "message": "Add an image to use models/gemini-pro-vision, or switch your model to a text model.", "status": "INVALID_ARGUMENT" } }
I am using