ArtifexSoftware / pdf2docx

Open source Python library for converting PDF to DOCX.
https://pdf2docx.readthedocs.io
GNU Affero General Public License v3.0
2.46k stars, 356 forks

[WARNING] Ignore Line "<image>" due to overlap #278

Closed GohanHango closed 1 week ago

GohanHango commented 5 months ago

I'm getting inaccurate output. Some images aren't in the same position as in the original PDF.

Here is a sample: Before: image

After: image

The image is duplicated. The first copy is compressed, and the second copy sits beneath the first.

greendreamer commented 1 week ago

Closing this due to a lack of response over an extended period. Feel free to open a new issue, but please include a reproducing example.