pts / pdfsizeopt

PDF file size optimizer
GNU General Public License v2.0

Acrobat Reader 2022.003.20322 cannot display some pages in PDFs generated by pdfsizeopt #162

Open Adreitz opened 1 year ago

Adreitz commented 1 year ago

Hello. I've run into an issue off and on, and it seems to be getting more frequent, where pdfsizeopt produces output that Acrobat Reader doesn't like. In some cases certain contents do not render, and in others entire pages are lost. (Note that I can't guarantee that these two symptoms come from the same bug.) In either case, Reader reports the error message, "An error exists on this page. Acrobat may not display the page correctly. Please contact the person who created the PDF document to correct the problem."

Attached, please find a test PDF that I've extracted from a larger PDF. In this case, running pdfsizeopt on Windows 10 with the command "pdfsizeopt.exe --use-image-optimizer=sam2p,jbig2" produces the attached corrupt output. One image does not render (Reader version 2022.003.20322). I have what I believe to be the newest Windows pdfsizeopt files, downloaded just a few days ago. Thanks!
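For anyone trying to reproduce this, here is a minimal Python sketch of the invocation described above. The function names are made up for illustration, and the trailing input and output file arguments are an assumption based on the usual `pdfsizeopt input.pdf output.pdf` usage; the report itself only quotes the executable and the `--use-image-optimizer` flag.

```python
import subprocess


def build_pdfsizeopt_command(src, dst, exe="pdfsizeopt.exe",
                             image_optimizers=("sam2p", "jbig2")):
    """Return the argument list for one pdfsizeopt run.

    Only the executable name and --use-image-optimizer come from the report;
    passing input and output paths as the last two arguments is an assumption.
    """
    flag = "--use-image-optimizer=" + ",".join(image_optimizers)
    return [exe, flag, str(src), str(dst)]


def optimize(src, dst, **kwargs):
    """Run pdfsizeopt and return the CompletedProcess so stderr can be inspected."""
    cmd = build_pdfsizeopt_command(src, dst, **kwargs)
    return subprocess.run(cmd, capture_output=True, text=True)


# Example (needs pdfsizeopt.exe on PATH):
# result = optimize("test.pdf", "test_corrupted.pdf")
# print(result.returncode, result.stderr)
```

Keeping the command construction separate from the run makes it easy to log the exact command line next to each output file.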

test.pdf test_corrupted.pdf

pts commented 1 year ago

Thank you for reporting this bug!

Unfortunately, I can't reproduce the problem. For me, both test.pdf and test_corrupted.pdf render identically in Google Chrome and Evince (GNOME PDF viewer) and Ghostscript 9.26. I don't have easy access to a Windows 10 computer (for development), and I don't have Acrobat Reader 2022.003.20322 installed on my Linux computers.

Could you please attach some screenshots of the missing images and a screenshot of the Acrobat Reader error message?

So far, a plausible explanation is that Acrobat Reader is buggy. To disprove that, we'd need at least one other PDF viewer that is unable to display test_corrupted.pdf correctly. Could you please try it in Sumatra PDF and Foxit on Windows?
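One way to make the "renders identically" check less dependent on eyeballing is to rasterize both files with Ghostscript and compare the page images. The following is only a rough sketch: the function names, the `page-%03d.png` naming, and the 100 dpi default are assumptions for illustration, and a byte-for-byte comparison will flag any pixel difference, including ones that are not visible.

```python
import hashlib
from pathlib import Path


def build_gs_render_command(pdf, out_dir, gs="gs", dpi=100):
    """Return a Ghostscript command that writes one PNG per page into out_dir."""
    pattern = str(Path(out_dir) / "page-%03d.png")
    return [gs, "-dNOPAUSE", "-dBATCH", "-dQUIET", "-sDEVICE=png16m",
            f"-r{dpi}", f"-sOutputFile={pattern}", str(pdf)]


def differing_pages(dir_a, dir_b):
    """Return names of page images that are missing on one side or differ in content."""
    def digests(directory):
        return {p.name: hashlib.sha256(p.read_bytes()).hexdigest()
                for p in Path(directory).glob("page-*.png")}

    a, b = digests(dir_a), digests(dir_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))


# Example (needs Ghostscript installed; the directories must already exist):
# import subprocess
# subprocess.run(build_gs_render_command("test.pdf", "orig"), check=True)
# subprocess.run(build_gs_render_command("test_corrupted.pdf", "opt"), check=True)
# print(differing_pages("orig", "opt"))
```

An empty list only says that Ghostscript renders both files the same way; it says nothing about how Acrobat Reader interprets them.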

Adreitz commented 1 year ago

Thanks for checking this out. This is interesting. I've attached small screenshots showing how Adobe only partially renders the page on loading the PDF, but then can be made to complete the render (minus the problem image) by changing the zoom level.

[Screenshots: error message and initial render; after page refresh]

I don't have admin access on my work computer, so I can't install any software, but I was able to download the portable version of Sumatra. I also tried loading the PDFs using the built-in PDF readers within Chrome and Edge.

All of the non-Adobe PDF readers opened the files and rendered the images with no error messages. However, they rendered them differently, both from the original and from each other. The rendering variation is subtle in this test PDF, so see the attachments for an alternate test PDF in which it is more obvious, as well as screenshots of zoomed page sections from Acrobat, Chrome, and Sumatra. (Apparently, Adobe's definition of 400% zoom is different from Chrome's or Sumatra's.) Note also that Sumatra took a long time (maybe about 30 seconds, I didn't time it) to open the "corrupt" PDF. The second test file was processed with the same pdfsizeopt command from my post above.

test2.pdf test2_corrupted.pdf

Acrobat original: [screenshot]

Acrobat corrupt: [screenshot]

Chrome original: [screenshot]

Chrome corrupt: [screenshot]

Sumatra original: [screenshot]

Sumatra corrupt: [screenshot]

I hope this is helpful with troubleshooting.

Adreitz commented 1 year ago

Sorry for the double post, but here is an example where whole pages go missing (at least in Acrobat Reader). I wasn't able to reproduce this behavior in either Chrome or Sumatra -- they both rendered the processed files seemingly correctly.

Interestingly, Reader's behavior here depends on the number of pages I extract. If I process the whole PDF, which I didn't want to upload here, pages 1-3 load and render, 4-7 are collapsed to a small white square, and 8 renders. I believe there may be another set of pages that fail to render near the end of the PDF.

[Screenshot: lost pages]

However, if I extracted just pages 3-8 and processed that PDF, Reader rendered all of the pages! Processing the first eight pages of the document caused Reader to fail to render pages 4-8 and processing the first nine pages caused Reader to fail to render pages 4-9. (The below are the first nine pages before and after processing.) test3.pdf test3.pso.pdf

Processing the first ten pages, though, Reader again renders them all! It's weird behavior. test4.pdf test4.pso.pdf
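Since the failure depends on how many leading pages are included, it may help to generate every prefix of the document systematically and run each one through pdfsizeopt. The sketch below builds qpdf commands for that; using qpdf is an assumption (the thread does not say which tool did the extraction), and the function name and output file naming are made up for illustration.

```python
from pathlib import Path


def build_prefix_commands(src, max_pages, qpdf="qpdf"):
    """Return one qpdf command per prefix: pages 1-1, 1-2, ..., 1-max_pages."""
    commands = []
    for n in range(1, max_pages + 1):
        # Hypothetical naming scheme: full.pdf -> full.first8.pdf
        out = f"{Path(src).stem}.first{n}.pdf"
        commands.append([qpdf, "--empty", "--pages", str(src), f"1-{n}", "--", out])
    return commands


# Example (needs qpdf installed):
# import subprocess
# for cmd in build_prefix_commands("full.pdf", 10):
#     subprocess.run(cmd, check=True)
```

Each prefix file can then be optimized with the same pdfsizeopt command and opened in Reader, which should narrow down the page counts at which rendering breaks (8 and 9 but not 10 in the cases above).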

jul059 commented 1 year ago

I have the same problem with Adobe Acrobat Pro 11.0.20.17. From my non-developer perspective, it looks similar to bug 114, which I filed previously.

Here is another file producing a "(14)" error:

pdfsizeopt v8, "pdfsizeopt.exe --use-image-optimizer=sam2p,jbig2"

sample_display_ok.pdf sample_display_not_ok.pdf
