sfneal / PyPDF3

A utility to read and write PDFs with Python
https://pythonhosted.org/PyPDF2/

Suppressing errors that are generated when processing #23

Open blackdwarf opened 8 months ago

blackdwarf commented 8 months ago

Hi,

I am using PyPDF3 to extract metadata from a batch of PDF files on my drive. It works well overall, but it keeps writing messages like the following to STDOUT:

invalid pdf header: b'Comun'
incorrect startxref pointer(3)

I understand that these messages appear because the PDF files being parsed contain errors, and that is fine.

What I've tried:

  1. Passing strict=False to the PdfReader constructor. According to the PdfReader docs it is already False by default, but I thought it couldn't hurt.
  2. Setting the logging level for the PyPDF2 logger as explained in the documentation.

Neither of these worked, so I'm at a loss as to how to suppress these messages (or log them to a different place).

Does anyone know a way to do this that actually works? Thanks!
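
For reference, here is a minimal sketch of one generic workaround that might apply, using only the Python standard library: temporarily redirect stdout and stderr and silence the warnings module around the parsing call. It is not confirmed that PyPDF3 emits its messages through any of these channels. `noisy_parse` is a hypothetical stand-in for the real parsing call, and `call_quietly` is an illustrative helper name, not part of any library.

```python
import contextlib
import io
import sys
import warnings


def noisy_parse(path):
    """Hypothetical stand-in for a PDF-parsing call that emits diagnostics."""
    print("invalid pdf header: b'Comun'")                      # goes to stdout
    print("incorrect startxref pointer(3)", file=sys.stderr)   # goes to stderr
    warnings.warn("stand-in parser warning")                   # warnings module
    return {"path": path}


def call_quietly(func, *args, **kwargs):
    """Run func and return (result, captured_text) instead of printing."""
    buffer = io.StringIO()
    # Swap sys.stdout and sys.stderr for an in-memory buffer while func runs.
    with contextlib.redirect_stdout(buffer), contextlib.redirect_stderr(buffer):
        # Drop anything raised through warnings.warn for the same duration.
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            result = func(*args, **kwargs)
    return result, buffer.getvalue()


result, captured = call_quietly(noisy_parse, "example.pdf")
```

The captured text can then be discarded or written to a file or logger of your choice. One caveat: a logging handler that stored a reference to the original stream before the redirect will keep writing to that stream, so this approach would not catch output sent that way.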