algoo / preview-generator

generates previews of files with cache management
https://pypi.org/project/preview-generator/
MIT License

Bug: get_page_nb breaks on some pdf's #259

Closed: mecharmor closed this issue 2 years ago

mecharmor commented 3 years ago

Description and expectations

Calling get_page_nb on a valid PDF raises an exception:

```
Traceback (most recent call last):
  File ".../FileHandler.py", line 89, in doc_2_image
    tmp.name, height=height, width=width, force=True, page=page
  File ".../site-packages/preview_generator/manager.py", line 203, in get_jpeg_preview
    mimetype=preview_context.mimetype,
  File ".../lib/python3.7/site-packages/preview_generator/preview/builder/pdfpypdf2.py", line 69, in build_jpeg_preview
    input_pdf = utils.get_decrypted_pdf(pdf, strict=False)
  File ".../lib/python3.7/site-packages/preview_generator/utils.py", line 189, in get_decrypted_pdf
    pdf = PdfFileReader(stream, strict, warndest, overwriteWarnings)
  File ".../lib/python3.7/site-packages/PyPDF2/pdf.py", line 1084, in __init__
    self.read(stream)
  File ".../lib/python3.7/site-packages/PyPDF2/pdf.py", line 1689, in read
    stream.seek(-1, 2)
OSError: [Errno 22] Invalid argument
```

How to reproduce

Call get_page_nb on the attached PDF (2_break_preview_gen.pdf). My exact usage: `pm.get_page_nb()`
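For anyone without the attached file, the last frame of the traceback (`stream.seek(-1, 2)`) can be exercised on its own. The sketch below shows one known way to get `OSError: [Errno 22]` from that call: seeking backwards from the end of an empty file. Whether that is what happens with the attached PDF is an assumption, not something established in this thread, and the helper name `last_byte_seek_ok` is made up for illustration.

```python
import os
import tempfile


def last_byte_seek_ok(path):
    """Return True if seeking to one byte before EOF succeeds.

    This repeats the stream.seek(-1, 2) call shown in the traceback.
    """
    with open(path, "rb") as stream:
        try:
            stream.seek(-1, 2)  # whence=2: offset is relative to end of file
        except OSError:
            return False
    return True


# An empty file: there is no byte before EOF to seek to.
with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
    empty_path = tmp.name

# A file with some content: the same seek succeeds.
with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
    tmp.write(b"%PDF-1.4\n")
    nonempty_path = tmp.name

empty_ok = last_byte_seek_ok(empty_path)
nonempty_ok = last_byte_seek_ok(nonempty_path)

os.remove(empty_path)
os.remove(nonempty_path)
```

If this check fails on the file actually passed to preview-generator (for example a temporary file that was not flushed before its name was handed over), the problem may be in how the file is written rather than in the PDF itself.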

Version information

inkhey commented 3 years ago

Hello, thanks for reporting this bug. Unfortunately, our PDF builder, which is based on both qpdf and PyPDF2, doesn't work with your file. I will check whether the cause lies in PyPDF2, preview-generator, or qpdf.

Please also note that this builder should be replaced in the future by something more resilient: https://github.com/algoo/preview-generator/issues/192

mecharmor commented 3 years ago

Thank you @inkhey for looking into this. My main issue is that when get_page_nb fails, I cannot reliably call get_jpeg_preview with page set to -1 or 1. Strangely enough, when I use a single-page PDF and specify page 1, it raises an exception (I can elaborate if needed).
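Until the builder is fixed, a caller could guard the page-count lookup. The following is a minimal sketch, not part of preview-generator: `safe_page_count` is a hypothetical helper, `manager` is assumed to be a `PreviewManager` instance (or anything exposing `get_page_nb(file_path)`), and only the `OSError` seen in the traceback is caught.

```python
def safe_page_count(manager, file_path, default=None):
    """Return the page count, or `default` if the lookup raises OSError.

    Hypothetical helper: `manager` is assumed to expose
    get_page_nb(file_path), as preview-generator's PreviewManager does.
    """
    try:
        return manager.get_page_nb(file_path)
    except OSError:
        # The traceback in this issue ends in OSError: [Errno 22];
        # other exception types are deliberately left to propagate.
        return default
```

A caller can then check for `None` before deciding which page to request. Note that in the traceback the same error is raised inside get_jpeg_preview, so this only avoids crashing on the page count and does not make the preview itself succeed.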

inkhey commented 2 years ago

Tested, this has been fixed with https://github.com/algoo/preview-generator/pull/275