getomni-ai / zerox

PDF to Markdown with vision models
https://getomni.ai/ocr-demo
MIT License
6.74k stars, 367 forks

If the PDF file I upload contains pictures as well as text, can the pictures be recognized and extracted as separate image files? #109

Open Wyzanezan opened 3 days ago

tylermaran commented 3 days ago

Hey @Wyzanezan. Can you share an example of the type of page you're thinking of?

Right now we're not isolating images out of documents, but it's something people have been interested in. Today you'd just get a placeholder in the Markdown output: [image description](image)
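As a rough illustration only (this is not part of zerox), here is a minimal Python sketch for locating those placeholders in the Markdown output, so you can at least see where images appeared on a page. It assumes the placeholders follow the [image description](image) form quoted above; the real output may differ, and the names `PLACEHOLDER` and `find_image_placeholders` are hypothetical.

```python
import re

# Assumption: placeholders look like "[some description](image)", optionally
# prefixed with "!" as in standard Markdown image syntax. Adjust the pattern
# if the actual output uses a different form.
PLACEHOLDER = re.compile(r"!?\[([^\]]*)\]\(image\)")


def find_image_placeholders(markdown: str) -> list[tuple[int, str]]:
    """Return (line_number, description) for each image placeholder found."""
    found = []
    for line_number, line in enumerate(markdown.splitlines(), start=1):
        for match in PLACEHOLDER.finditer(line):
            found.append((line_number, match.group(1)))
    return found
```

This only reports where the model described an image. It does not recover the image itself, which would still have to be pulled from the PDF with a separate tool.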

Wyzanezan commented 2 days ago

[image]

Hi @tylermaran, the image above is part of a PDF page. When I use zerox, I'd like it to recognize this as a picture, extract it, and save it to a separate directory, but zerox can't do that at present.