getomni-ai / zerox

PDF to Markdown with vision models
https://getomni.ai/ocr-demo
MIT License
6.58k stars 358 forks

Failed to process image Error #67

Closed PTson2207 closed 4 weeks ago

PTson2207 commented 1 month ago

Hi everyone, I am trying out zerox and got the following error. I hope someone can help. Here are my steps:

git clone ...
pip install -e .

Create a file test.py that uses a Gemini model:

file_path = "./cs101.pdf"
output_dir = "./output_test/cs101.md"
result = asyncio.run(pyzerox.zerox(file_path=file_path, model=model, output_file_path=output_dir, custom_system_prompt=custom_system_prompt, **kwargs))
print(result)

I get this result:

ERROR:root: Failed to process image Error: expected string or bytes-like object, got 'NoneType'
ZeroxOutput(completion_time=15140.184, file_name='cs101', input_tokens=0, output_tokens=0, pages=[Page(content='', content_length=0, page=1)])
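The output above shows that the failure is only logged: the call still returns a ZeroxOutput, but with zero token counts and an empty page. A minimal sketch of a check for that case follows. The Page and ZeroxOutput dataclasses here are hypothetical stand-ins that mirror only the fields printed above, not pyzerox's own classes, and looks_like_failed_run is an illustrative helper name.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical stand-ins that mirror the fields in the printed result.
# They are NOT imported from pyzerox.
@dataclass
class Page:
    content: str
    content_length: int
    page: int


@dataclass
class ZeroxOutput:
    completion_time: float
    file_name: str
    input_tokens: int
    output_tokens: int
    pages: List[Page] = field(default_factory=list)


def looks_like_failed_run(result) -> bool:
    """Return True when a result carries no usable model output."""
    # No tokens in either direction suggests the model call never succeeded.
    no_tokens = result.input_tokens == 0 and result.output_tokens == 0
    # Every page being blank (or no pages at all) is also treated as a failure.
    all_empty = all(not p.content.strip() for p in result.pages)
    return no_tokens or all_empty


# The result reported in this issue, rebuilt with the stand-in classes.
failed = ZeroxOutput(15140.184, "cs101", 0, 0, [Page("", 0, 1)])
print(looks_like_failed_run(failed))
```

Because the helper only reads attributes, it should also accept the object returned by the real call, assuming it exposes the same field names as the printed output.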

pradhyumna85 commented 1 month ago

Duplicate of #49, #58

PTson2207 commented 1 month ago

Sorry, but I know that zerox only supports .pdf files. The file I used is a PDF, cs101.pdf, available at: https://omni-demo-data.s3.amazonaws.com/test/cs101.pdf

ZisuWang commented 4 weeks ago

It seems that Google has changed its LLM endpoints. For example, to use Gemini 1.5 Pro, you should set model = "gemini/gemini-1.5-pro-002" instead of model = "gemini/gemini-1.5-pro".
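Applied to the test script from the first post, the suggestion might look like the sketch below. The keyword names (file_path, model, output_file_path) are copied from that post rather than checked against the library, build_call_args, convert and run are illustrative helper names, and the sketch assumes a Gemini API key is already configured in the environment. pyzerox is imported lazily so the helpers can be inspected without it installed.

```python
import asyncio

# Versioned model name suggested in this thread. Whether it is still
# accepted depends on the provider's current model list.
MODEL = "gemini/gemini-1.5-pro-002"


def build_call_args(file_path: str, output_path: str, model: str = MODEL) -> dict:
    """Collect the keyword arguments used in the original test.py."""
    return {
        "file_path": file_path,
        "model": model,
        "output_file_path": output_path,
    }


async def convert(file_path: str, output_path: str, model: str = MODEL):
    """Call pyzerox.zerox with the versioned model name."""
    import pyzerox  # imported here so the module loads without pyzerox

    return await pyzerox.zerox(**build_call_args(file_path, output_path, model))


def run(file_path: str, output_path: str, model: str = MODEL):
    """Synchronous wrapper, mirroring asyncio.run(...) in the original script."""
    return asyncio.run(convert(file_path, output_path, model))


# Example call, using the paths from the first post:
# result = run("./cs101.pdf", "./output_test/cs101.md")
# print(result)
```

Keeping the model string in one constant makes it easy to swap again if the provider renames the model.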