pymupdf / RAG

RAG (Retrieval-Augmented Generation) Chatbot Examples Using PyMuPDF
https://pymupdf.readthedocs.io/en/latest/pymupdf4llm
GNU Affero General Public License v3.0

adding dpi setting and fixing remaining image output #31

Closed: jakubkovac closed this 4 months ago

jakubkovac commented 5 months ago

I believe the remaining images should be written using the vg_clusters variable; the variable currently used there looks like a typo in the code.

Additionally, this adds a dpi argument to .to_markdown(). This is useful if any OCR is needed after the images are extracted, since they could otherwise be in very low resolution.
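To illustrate why the resolution matters for OCR, here is a minimal sketch of how a dpi value translates into pixel dimensions. It relies only on the fact that PDF user space measures in points (72 per inch); the helper name pixel_size and the example region size are hypothetical and not part of this PR or of pymupdf4llm, and the exact rounding used by a real renderer may differ.

```python
POINTS_PER_INCH = 72  # PDF user space unit: 1 point = 1/72 inch


def pixel_size(width_pt: float, height_pt: float, dpi: int) -> tuple[int, int]:
    """Approximate pixel dimensions of a page region rendered at `dpi`.

    Hypothetical helper for illustration only; a real renderer may
    round differently (for example, truncate or round up).
    """
    scale = dpi / POINTS_PER_INCH  # zoom factor relative to 72 dpi
    return round(width_pt * scale), round(height_pt * scale)


if __name__ == "__main__":
    # A 200 x 100 pt image region (assumed size) at a few resolutions.
    for dpi in (72, 150, 300):
        print(dpi, pixel_size(200, 100, dpi))
```

With this PR applied, the intent is that a call along the lines of to_markdown(doc, write_images=True, dpi=300) would produce the larger images; treat the exact keyword combination as an assumption and check the merged signature.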

Let me know what you think.