tamdao closed this 2 months ago
Your PDF contains multiple areas with a white background, drawn as vector graphics. Version 0.0.8 has improved logic that ignores vector graphics consisting only of background coloring. Significant vector graphics are converted to images and treated like images. To count as significant, a vector graphic must also contain stroked drawings (other than the border of the drawing rectangle).
@JorjMcKie Thanks for your response. I apologize for the confusion; I checked the wrong version.
I use v0.0.8 with
to_markdown(doc)
By default, write_images=False, but the Markdown syntax for images is still included in the output.
Attachments: saint.pdf, output.md
There is another issue with this file. When I set write_images=True, it doesn't work correctly. Even though the file doesn't contain any images, the result includes some white images.
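For anyone trying to confirm the first symptom (image syntax appearing even with write_images=False), here is a minimal sketch that scans a Markdown string for image references. The helper name find_image_refs and the regular expression are my own assumptions, not part of the library; it only handles the inline `![alt](target)` form.

```python
import re

# Hypothetical helper (not part of any library): matches inline Markdown
# images of the form ![alt](target) or ![alt](target "title").
IMAGE_REF = re.compile(r'!\[([^\]]*)\]\(([^)\s]+)[^)]*\)')


def find_image_refs(markdown_text):
    """Return the target of every inline image reference in the text."""
    return [m.group(2) for m in IMAGE_REF.finditer(markdown_text)]


if __name__ == "__main__":
    # Made-up sample text for illustration, not the actual output.md.
    sample = "Some text\n\n![](page-0-0.png)\n\nMore text with a [link](x.html)."
    print(find_image_refs(sample))
```

To run it against real output, pass in the string returned by to_markdown(doc) (assuming the package in question is pymupdf4llm and doc is the opened saint.pdf). A non-empty list with write_images=False would reproduce the report.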