pmaupin / pdfrw

pdfrw is a pure Python library that reads and writes PDFs

Unexpected AssertionError on malformed PDF #211

Open Google-Autofuzz opened 3 years ago

Google-Autofuzz commented 3 years ago

Running the following code with the latest PyPI version of pdfrw on the attached input raises an unexpected AssertionError:

import io
import sys
from pdfrw import PdfReader

with open(sys.argv[1], 'rb') as f:
    PdfReader(io.BytesIO(f.read()))
$ python3 pypdf2_repro.py ../test.pdf
Traceback (most recent call last):
  File "pdfreader_repro.py", line 6, in <module>
    PdfReader(io.BytesIO(f.read()))
  File "/home/user/.local/lib/python3.8/site-packages/pdfrw/pdfreader.py", line 611, in __init__
    startloc, source = self.findxref(fdata)
  File "/home/user/.local/lib/python3.8/site-packages/pdfrw/pdfreader.py", line 333, in findxref
    assert tok == 'startxref'  # (We just checked this...)
AssertionError
$ 

test.pdf
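
The traceback ends at an `assert` in `findxref` that expects the `startxref` token. As a rough caller-side sketch, not a fix for pdfrw itself, the file trailer could be checked before the bytes are handed to `PdfReader`. The helper name `find_startxref_offset` and the 1024-byte tail window are assumptions made for illustration; neither comes from pdfrw.

```python
import re

# Hypothetical helper, not part of pdfrw. A well-formed PDF ends with the
# keyword `startxref`, an integer byte offset, and the `%%EOF` marker.
_STARTXREF_RE = re.compile(rb"startxref\s+(\d+)\s+%%EOF")


def find_startxref_offset(data: bytes, tail_size: int = 1024):
    """Return the xref offset found in the file tail, or None if it is missing."""
    # Only the end of the file is searched; the window size is an assumption.
    tail = data[-tail_size:]
    matches = list(_STARTXREF_RE.finditer(tail))
    if not matches:
        return None
    # Use the last match, since an updated file may carry several trailers.
    offset = int(matches[-1].group(1))
    # An offset pointing past the end of the data cannot be valid.
    return offset if offset < len(data) else None
```

A caller can treat a `None` result as a malformed input and report a clear error instead of relying on an assertion, which is also skipped entirely when Python runs with `-O`.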