Trailing spaces and NUL characters in PDF cause failure identifying EOF

mfenniak / pyPdf

Pure-Python PDF Library; this repository is no longer maintained, please see https://github.com/knowah/PyPDF2/ instead.

Trailing spaces and NUL characters in PDF cause failure identifying EOF #19

Closed freakboy3742 closed 13 years ago

freakboy3742 commented 13 years ago

I have a collection of PDFs that contain a line of NUL and space characters after the %%EOF marker. The current technique for identifying %%EOF fails on these PDFs: the 'while not line' check on line 704 of pdf.py (the start of the read() method on PdfFileReader) only skips empty lines, so it doesn't treat a line of NULs and spaces as something worth ignoring.
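To illustrate the kind of check that would tolerate this padding, here is a minimal sketch. It is not pyPdf's actual code: the function name `last_meaningful_line`, the `PADDING` set, and the `tail_size` parameter are all hypothetical, and it assumes that NUL bytes and ASCII whitespace are the only characters that need to be ignored.

```python
import io

# Assumption: bytes that count as ignorable padding after %%EOF
# (NUL plus ASCII whitespace). Adjust if other junk bytes show up.
PADDING = b"\x00 \t\r\n\x0c"


def last_meaningful_line(stream, tail_size=1024):
    """Return the last line near the end of the stream that holds
    something other than padding, with the padding stripped."""
    stream.seek(0, io.SEEK_END)
    size = stream.tell()
    stream.seek(max(0, size - tail_size))
    tail = stream.read()

    # Walk the lines from the end, skipping any made up solely of padding.
    for line in reversed(tail.splitlines()):
        stripped = line.strip(PADDING)
        if stripped:
            return stripped
    return b""


# A PDF tail with a line of NULs and spaces after the %%EOF marker.
sample = io.BytesIO(b"%PDF-1.4\nstartxref\n123\n%%EOF\n\x00\x00  \x00 \n")
print(last_meaningful_line(sample) == b"%%EOF")
```

The difference from a plain 'while not line' test is that a line is judged empty after stripping the padding bytes, so a line of NULs and spaces is skipped just like a blank one.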

freakboy3742 commented 13 years ago

Closing as a duplicate of #20, which has a pull request attached.