google-gemini / cookbook

Examples and guides for using the Gemini API.
https://ai.google.dev/gemini-api/docs
Apache License 2.0

Can Gemini handle PDF files? #158

Open helai78 opened 1 month ago

helai78 commented 1 month ago

Description of the feature request:

https://ai.google.dev/gemini-api/docs/prompting_with_media?lang=python Based on the above link, it seems the Gemini API does not work with PDF files. Is my understanding right?

What problem are you trying to solve with this feature?

No response

Any other information you'd like to share?

No response

singhniraj08 commented 1 month ago

@helai78, the documentation lists the supported text formats. The Gemini API does not support PDF files because the application/pdf MIME type is not supported yet. Alternatively, you can use AI Studio to work with PDF files using Gemini. Thank you!
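
For readers who want to check a file before uploading it, here is a minimal sketch of the MIME type check described above. It uses only Python's standard `mimetypes` module. The helper names and the `EXAMPLE_ACCEPTED` set are hypothetical placeholders for illustration, not the official list of supported formats, so consult the documentation for the real list.

```python
import mimetypes


def guess_mime_type(path):
    """Return the MIME type Python infers from the file extension, or None."""
    mime_type, _encoding = mimetypes.guess_type(path)
    return mime_type


def is_accepted(path, accepted_mime_types):
    """Check a file's inferred MIME type against a caller-supplied set."""
    return guess_mime_type(path) in accepted_mime_types


# Placeholder set for illustration only; NOT the official supported list.
EXAMPLE_ACCEPTED = {"text/plain", "image/png", "image/jpeg"}

for name in ("notes.txt", "scan.png", "report.pdf"):
    print(name, guess_mime_type(name), is_accepted(name, EXAMPLE_ACCEPTED))
```

A PDF maps to `application/pdf`, which is the type the comment above says is not supported yet.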

helai78 commented 1 month ago

Hello @singhniraj08, thank you for your clarification.

The AI Studio you mentioned is the Vertex AI Gemini API, which can handle PDF files. Vertex AI is part of Google Cloud, which means 90 days free for me. Is my understanding correct?

Could you tell me any alternatives for handling PDF files with Gemini 1.5 Pro?

Thanks in advance.

anusonawane commented 11 hours ago

Hello @helai78, currently there is no direct support for uploading PDF files, but we can work around this by converting the PDF to images and extracting the text separately. See https://github.com/google-gemini/cookbook/blob/main/quickstarts/PDF_Files.ipynb
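
The workaround above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not necessarily what the linked notebook does. It assumes the Poppler command-line tools `pdftotext` and `pdftoppm` are installed, that the `google-generativeai` and `Pillow` packages are available, and that an API key is stored in a `GOOGLE_API_KEY` environment variable. The helper names (`poppler_commands`, `split_pages`, `build_prompt`, `ask_gemini_about_pdf`) and the `## Page N ##` markers are made up for illustration.

```python
import os
import subprocess
from pathlib import Path


def poppler_commands(pdf_path, out_dir, dpi=150):
    """Build command lines for Poppler's pdftotext and pdftoppm tools."""
    pdf, out = Path(pdf_path), Path(out_dir)
    # -layout keeps the physical layout of the text as far as possible.
    text_cmd = ["pdftotext", "-layout", str(pdf), str(out / "text.txt")]
    # -png writes one PNG per page, named <prefix>-<page number>.png.
    image_cmd = ["pdftoppm", "-png", "-r", str(dpi), str(pdf), str(out / "page")]
    return text_cmd, image_cmd


def split_pages(raw_text):
    """Split pdftotext output into pages (pages end with a form feed)."""
    pages = raw_text.split("\f")
    if pages and not pages[-1].strip():
        pages.pop()  # drop the empty entry left by the trailing form feed
    return pages


def build_prompt(question, page_texts, page_images):
    """Interleave the question, per-page text, and per-page images."""
    parts = [question]
    for number, (text, image) in enumerate(zip(page_texts, page_images), start=1):
        parts.append(f"## Page {number} ##")
        if text.strip():  # skip pages with no extractable text
            parts.append(text)
        parts.append(image)
    return parts


def ask_gemini_about_pdf(pdf_path, question, work_dir="pdf_pages"):
    """Convert the PDF, then send text and page images to the model."""
    # Third-party imports are kept local so the helpers above work without them.
    import google.generativeai as genai
    from PIL import Image

    out = Path(work_dir)
    out.mkdir(parents=True, exist_ok=True)
    for cmd in poppler_commands(pdf_path, out):
        subprocess.run(cmd, check=True)

    page_texts = split_pages((out / "text.txt").read_text(encoding="utf-8"))
    page_images = [Image.open(p) for p in sorted(out.glob("page-*.png"))]

    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    model = genai.GenerativeModel("gemini-1.5-pro")
    response = model.generate_content(build_prompt(question, page_texts, page_images))
    return response.text
```

Calling `ask_gemini_about_pdf("paper.pdf", "Summarize this document.")` would run the whole pipeline. Sending both the extracted text and the page images lets the model see figures and layout that plain text extraction would lose.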