getomni-ai / zerox

PDF to Markdown with vision models
https://getomni.ai/ocr-demo
MIT License
6.58k stars, 358 forks

[Error] Gemini models are not working #90

Open eugeek opened 2 weeks ago

eugeek commented 2 weeks ago

Hi, I'm trying to use Gemini models through Vertex AI (vertex_ai/gemini-1.5-flash, vertex_ai/gemini-1.5-flash-001, vertex_ai/gemini-1.5-flash-8b) as well as gemini/gpt-4o-mini, but every call fails with an error saying the model is not a vision model.

Errors

The provided model is not a vision model. Please provide a vision model.
     (Extra Info: {'model': 'gemini/gpt-4o-mini'})
The provided model is not a vision model. Please provide a vision model.
     (Extra Info: {'model': 'vertex_ai/gemini-1.5-flash-001'})
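
To rule out plain typos in the model strings, here is a minimal sketch of a sanity check. It assumes the litellm-style `provider/model` convention and that, for the two providers used in this issue, the model name should start with `gemini`. The helper `check_model_string` and the `KNOWN_PREFIXES` set are hypothetical and not part of pyzerox or litellm.

```python
from typing import List

# Assumption: only the two provider prefixes that appear in this issue.
KNOWN_PREFIXES = {"gemini", "vertex_ai"}


def check_model_string(model: str) -> List[str]:
    """Return a list of human-readable problems found in a 'provider/model' string."""
    problems = []
    provider, sep, name = model.partition("/")
    if not sep:
        # No slash at all, e.g. 'vertex_aigemini-1.5-flash-001'.
        problems.append(f"{model!r} has no '/' between provider and model name")
        return problems
    if provider not in KNOWN_PREFIXES:
        problems.append(f"unexpected provider prefix {provider!r}")
    # Heuristic for this issue only: Vertex AI can serve other model
    # families too, but every model tried here is meant to be a Gemini one.
    if not name.startswith("gemini"):
        problems.append(f"{name!r} does not look like a Gemini model name")
    return problems


if __name__ == "__main__":
    for m in (
        "vertex_ai/gemini-1.5-flash",
        "vertex_ai/gemini-1.5-flash-001",
        "vertex_ai/gemini-1.5-flash-8b",
        "gemini/gpt-4o-mini",
    ):
        print(m, check_model_string(m) or "ok")
```

A check like this would flag gemini/gpt-4o-mini, which pairs the Gemini provider prefix with an OpenAI model name. It would not flag vertex_ai/gemini-1.5-flash-001, so the second error above probably has a different cause.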

Source code:

from pyzerox import zerox
import os
import asyncio

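# Extra keyword arguments passed through to zerox (none here)
qwargs = {}
custom_system_prompt = None

# Attempt 1: Gemini provider prefix
model = "gemini/gpt-4o-mini"
os.environ['GEMINI_API_KEY'] = os.getenv('GEMINI_API_KEY')

# Attempt 2: Vertex AI (this assignment overrides the one above when both are left in)
model = "vertex_ai/gemini-1.5-flash-001"
os.environ['VERTEXAI_PROJECT'] = os.getenv('PROJECT_ID')
os.environ['VERTEXAI_LOCATION'] = os.getenv('PROJECT_REGION')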

async def main():
    file_path = "./test.pdf"

    select_pages = None

    output_dir = "./results"
    result = await zerox(file_path=file_path, model=model, output_dir=output_dir,
                         custom_system_prompt=custom_system_prompt, select_pages=select_pages, **qwargs)

    print(result)

    return result

result = asyncio.run(main())
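
To narrow down where the rejection comes from, the model strings can be checked directly against litellm, which pyzerox uses for model calls. This is a minimal sketch that assumes the "not a vision model" error is driven by `litellm.supports_vision`; I have not confirmed that in the pyzerox source. The wrapper `vision_support` is a hypothetical helper. It returns `None` when litellm is not installed or cannot answer.

```python
from typing import Optional


def vision_support(model: str) -> Optional[bool]:
    """Return litellm's view of vision support, or None if it cannot be determined."""
    try:
        import litellm  # assumption: installed alongside pyzerox
    except ImportError:
        return None
    try:
        return bool(litellm.supports_vision(model=model))
    except Exception:
        # Some litellm versions raise for models missing from their model map.
        return None


if __name__ == "__main__":
    for m in (
        "vertex_ai/gemini-1.5-flash",
        "vertex_ai/gemini-1.5-flash-001",
        "vertex_ai/gemini-1.5-flash-8b",
        "gemini/gpt-4o-mini",
    ):
        print(m, vision_support(m))
```

If this prints `False` for the Vertex AI model strings, the installed litellm version may not list them as vision-capable, and upgrading litellm would be worth trying before changing anything in the script.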

Thanks for the help 👍