Closed: Cozokim closed this 1 month ago
There is a size limit for images to be considered for extraction.
This prevents every tiny speck of dirt from being honored with its own file output.
The default is image_size_limit=0.05, which ignores images having width (height) < 0.05 * page.rect.width (height).
Set this to a smaller number if you want to see more images.
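To make the rule concrete, here is a minimal sketch of the size check as described above. The function name `is_ignored` and the page dimensions are hypothetical, and it assumes an image is skipped when either its width or its height falls below the limit; it illustrates the described rule and is not the library's actual code.

```python
def is_ignored(img_width, img_height, page_width, page_height, image_size_limit=0.05):
    """Hypothetical illustration of the size filter described above.

    Assumption: an image is skipped when EITHER dimension is smaller than
    image_size_limit times the corresponding page dimension.
    """
    too_narrow = img_width < image_size_limit * page_width
    too_short = img_height < image_size_limit * page_height
    return too_narrow or too_short


# Example page of 612 x 792 points (US Letter); thresholds at 0.05 are 30.6 and 39.6.
small_icon = (20, 20)
print(is_ignored(*small_icon, 612, 792))                         # default limit 0.05
print(is_ignored(*small_icon, 612, 792, image_size_limit=0.01))  # smaller limit
```

With the real library, the equivalent adjustment would be something like pymupdf4llm.to_markdown("input.pdf", write_images=True, image_size_limit=0.01), where "input.pdf" is a placeholder file name.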
Hi, I had a functional version with pymupdf==1.24.9 and pymupdf4llm==0.0.16.
But now, when I create a new env from scratch with pymupdf==1.24.10 and pymupdf4llm==0.0.17 and use pymupdf4llm.to_markdown(write_images=True), some images from my PDF are not extracted.
Downgrading the libraries manually didn't fix the problem, but installing from scratch with pymupdf==1.24.9 and pymupdf4llm==0.0.16 did, so I assume the cause is another library installed during the pip install of those versions.
I was able to fix the problem; this is just so you know :)
Cheers
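For anyone who wants to check whether a transitive dependency differs between a working and a broken environment, the following is one possible sketch using only the standard library. The helper names `installed_versions` and `diff_versions` are hypothetical, and the approach (dump versions in each env, then compare) is an assumption about how one might investigate, not something confirmed in this thread.

```python
from importlib.metadata import distributions


def installed_versions():
    """Return {package name: version} for the current environment."""
    versions = {}
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with broken metadata
            versions[name.lower()] = dist.version
    return versions


def diff_versions(old, new):
    """Return {name: (old_version, new_version)} for packages that differ.

    A value of None means the package is missing from that environment.
    """
    names = sorted(set(old) | set(new))
    return {n: (old.get(n), new.get(n)) for n in names if old.get(n) != new.get(n)}
```

One way to use it: call installed_versions() in each environment, save the result (for example as JSON), and pass the two dictionaries to diff_versions() to list every package whose version changed.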