johndoe31415 / pdfminify

PDF minifier that allows removing duplicate data, re-compressing images, creating PDF/A-1b files, and digitally signing PDFs
GNU General Public License v3.0
55 stars 11 forks

IndexError: bytearray index out of range #16

Open yatrik-cloud opened 1 year ago

yatrik-cloud commented 1 year ago

What is the issue?

I passed only the two required arguments (input and output file) and still encountered this error. I also tried different input PDF files and got the same result. Kindly address it.

pdfminify version 0.2.1; llpdf version: 0.0.5

(venv) PS D:\python_projects\pdf compression r&D\venv\Scripts> pdfminify "D:\python_projects\pdf compression r&D\66page.pdf" "D:\python_projects\pdf compression r&D\out.pdf"
Traceback (most recent call last):
  File "D:\python_projects\pdf compression r&D\venv\Scripts\pdfminify-script.py", line 33, in <module>
    sys.exit(load_entry_point('pdfminify==0.2.1', 'console_scripts', 'pdfminify')())
  File "D:\python_projects\pdf compression r&D\venv\Scripts\pdfminify-script.py", line 25, in importlib_load_entry_point
    return next(matches).load()
  File "C:\Users\Yatrik.s\AppData\Local\Programs\Python\Python310\lib\importlib\metadata\__init__.py", line 171, in load
    module = import_module(match.group('module'))
  File "C:\Users\Yatrik.s\AppData\Local\Programs\Python\Python310\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "D:\python_projects\pdf compression r&D\venv\lib\site-packages\pdfminify\__main__.py", line 139, in <module>
    pdf = llpdf.PDFReader().read(args.infile)
  File "D:\python_projects\pdf compression r&D\venv\lib\site-packages\llpdf\PDFReader.py", line 133, in read
    hdr_version = self._read_identifying_header(f)
  File "D:\python_projects\pdf compression r&D\venv\lib\site-packages\llpdf\PDFReader.py", line 40, in _read_identifying_header
    if (after_hdr[0] != ord("%")) or any(value & 0x80 != 0x80 for value in after_hdr[ 1 : 5 ]):
IndexError: bytearray index out of range
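
For readers following the traceback: the last frame evaluates `after_hdr[0]`, and indexing an empty bytearray raises exactly this IndexError. That suggests, but does not prove, that the reader received no bytes where it expected the line following the `%PDF-` header. Below is a minimal sketch of the same check with a length guard. `has_binary_marker` is a hypothetical helper written for illustration only; it is not llpdf's actual code or an official fix.

```python
def has_binary_marker(after_hdr: bytes) -> bool:
    """Hypothetical helper (not part of llpdf).

    Returns True if the data starts with '%' followed by four bytes
    that all have the high bit set, which is the condition the failing
    line in the traceback tests for.
    """
    # Guard the length first: too-short input is reported as "no marker"
    # instead of raising IndexError on after_hdr[0].
    if len(after_hdr) < 5:
        return False
    return after_hdr[0] == ord("%") and all(
        value & 0x80 == 0x80 for value in after_hdr[1:5]
    )


# Reproduce the error message from the report with an empty bytearray.
try:
    bytearray()[0]
except IndexError as exc:
    message = str(exc)

print(message)
print(has_binary_marker(bytearray()))
print(has_binary_marker(b"%\xe2\xe3\xcf\xd3\n"))
```

Whether a missing marker line should be treated as an error or merely tolerated is a design decision for the library; the sketch only shows that the check itself need not crash on short input.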

All Info Provided

AhmedThahir commented 8 months ago

Same issue I'm facing. Any updates?

johndoe31415 commented 8 months ago

The reason the third checkbox in "All Info Provided" is there is that without a PDF to test with, I am completely unable to reproduce the error. As such, there's nothing I can do to even replicate the issue.
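
When a PDF cannot be shared in full, a summary of its first bytes may still help narrow down a header-parsing failure like the one in this report. The following is a sketch under that assumption; `read_head` and `describe_header` are hypothetical names, not part of pdfminify or llpdf, and the output is no substitute for the actual file.

```python
def read_head(path: str, count: int = 32) -> bytes:
    """Read at most `count` bytes from the start of a file."""
    with open(path, "rb") as f:
        return f.read(count)


def describe_header(head: bytes) -> dict:
    """Summarize the first bytes of a file (hypothetical helper)."""
    return {
        # How many bytes were actually available.
        "length": len(head),
        # PDF files are expected to begin with "%PDF-".
        "starts_with_pdf_header": head.startswith(b"%PDF-"),
        # Which line-ending bytes occur in the sampled data.
        "contains_cr": b"\r" in head,
        "contains_lf": b"\n" in head,
        # Space-separated hex dump, safe to paste into an issue.
        "hex": head.hex(" "),
    }


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 2:
        for key, value in describe_header(read_head(sys.argv[1])).items():
            print(f"{key}: {value}")
```

Saved under any file name and run with the PDF path as its only argument, the script prints the summary, which can then be attached to the issue.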