Open 1339503169 opened 5 days ago
There is a difference in the behavior of the base library (MuPDF). I will transfer this report to MuPDF's issue tracker and post the tracking number here.
Test outputs: mutool-12311.txt mutool-12404.txt
MuPDF issue number: https://bugs.ghostscript.com/show_bug.cgi?id=707843
Description of the bug
mscbookin.pdf
![image](https://github.com/pymupdf/PyMuPDF/assets/22074904/0d83ea83-2bca-4eaa-a624-d2286993cd34)
While processing this file, I found that the string returned by the get_text() method is missing some of the text that is present in the original PDF.
The coordinates shown are multiplied by 2 because I applied 2x scaling when generating the image.
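For readers who want to compare the extracted words against a 2x rendering, the following is a minimal sketch of the coordinate mapping described above. It is not the reporter's script: the helper names `scale_bbox` and `words_in_image_coords`, the default page number, and the assumption of an unrotated page are all illustrative. It assumes PyMuPDF is installed for the second function and relies only on `fitz.open` and `page.get_text("words")`.

```python
def scale_bbox(bbox, zoom=2.0):
    """Map an (x0, y0, x1, y1) box from PDF points to pixel coordinates
    of an image rendered at the given zoom factor (hypothetical helper)."""
    x0, y0, x1, y1 = bbox
    return (x0 * zoom, y0 * zoom, x1 * zoom, y1 * zoom)


def words_in_image_coords(pdf_path, page_number=0, zoom=2.0):
    """Return (word, scaled_bbox) pairs for one page (hypothetical helper).

    Assumes an unrotated page whose image was rendered with the same zoom
    on both axes.
    """
    # Imported lazily so scale_bbox works without PyMuPDF installed.
    import fitz  # PyMuPDF

    doc = fitz.open(pdf_path)
    try:
        page = doc[page_number]
        # Each entry is (x0, y0, x1, y1, word, block_no, line_no, word_no).
        words = page.get_text("words")
        return [(w[4], scale_bbox(w[:4], zoom)) for w in words]
    finally:
        doc.close()
```

Calling `words_in_image_coords("mscbookin.pdf")` under two PyMuPDF versions and diffing the word lists is one way to see which text goes missing, assuming the attached file is saved locally under that name.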
How to reproduce the bug
PyMuPDF version
1.24.5
Operating system
Windows
Python version
3.8