Open anirudhagarwal1 opened 1 month ago
Providing the example file (not just the pictures) is mandatory for submitting a bug.
Since this document contains some sensitive information, I am not able to share it on a public forum. I tried to replicate the issue with several other PDFs and wasn't able to.
Could I email it to you privately instead?
Yes, certainly! Please do use this way.
I have sent the file to your GitHub email address: jorj.x.mckie@outlook.de
I have the same issue. When processing a PDF of this paper, the title and table borders were removed. https://arxiv.org/abs/2310.19909 This problem does not occur when using v1.23.26.
Please provide the link to an example PDF / page - I need it to report the bug!
@JorjMcKie Sorry, I should have been more explicit. The following URL is the link to the PDF: https://arxiv.org/pdf/2310.19909 The borders disappear on pages 1, 4, 7, and 8.
Problem file: notext.pdf
MuPDF issue reference: https://bugs.ghostscript.com/show_bug.cgi?id=707840
This specific file no longer seems to be an issue in recent versions. The test file above is still a problem.
Description of the bug
I have a single-page PDF file with a table in it. When I load the PDF and call the get_pixmap function, the rendered image keeps the table's content but drops the table borders around it.
import io
from PIL import Image
# page is a page of the already opened PDF
pix = page.get_pixmap(alpha=False, dpi=150)
image = Image.open(io.BytesIO(pix.tobytes()))
image.save("temp.jpeg", format='jpeg')
Unfortunately, I won't be able to share this particular PDF on an open platform. Could you suggest how I can debug it further?
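One way to narrow this down without sharing the file might be to check whether PyMuPDF still extracts the table borders as vector drawings, even though they are missing from the rendered image. The sketch below is only an assumption about where to look, not a confirmed diagnosis: `summarize_drawings` and `inspect_page` are hypothetical helper names, and it assumes the borders are drawn as lines or rectangles rather than embedded as an image.

```python
from collections import Counter
from importlib.metadata import version


def summarize_drawings(drawings):
    """Count vector drawing items by operator code.

    `drawings` is a list of dicts shaped like the result of PyMuPDF's
    page.get_drawings(): each dict has an "items" list of tuples whose
    first element is a code such as "l" (line), "re" (rectangle),
    "c" (curve) or "qu" (quad).
    """
    counts = Counter()
    for path in drawings:
        for item in path.get("items", []):
            counts[item[0]] += 1
    return dict(counts)


def inspect_page(pdf_path, page_number=0):
    """Print what PyMuPDF extracts from one page (hypothetical helper)."""
    import fitz  # PyMuPDF; imported here so summarize_drawings stays stdlib-only

    doc = fitz.open(pdf_path)
    page = doc[page_number]
    print("PyMuPDF version:", version("PyMuPDF"))
    print("drawing items:", summarize_drawings(page.get_drawings()))
    print("images on page:", len(page.get_images(full=True)))
```

If `inspect_page("your.pdf")` reports lines or rectangles but the pixmap shows none, the borders are present in the file and are lost at render time, which can be reported without attaching the sensitive document. Running the same script under v1.23.26 and v1.24.1 would show whether the extraction itself differs between versions.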
Here is a partial screenshot of this PDF and of the converted image. PDF -
Image from it -
How to reproduce the bug
Seems to be breaking only in this particular kind of PDF. Seems to be working fine elsewhere.
PyMuPDF version
1.24.1
Operating system
MacOS
Python version
3.10