Open ambigus9 opened 2 years ago
I am trying to write a PDF file. To do that, I am using the following code:
    import logging
    import tempfile

    import boto3
    from PyPDF3 import PdfFileWriter, PdfFileReader

    logger = logging.getLogger(__name__)

    s3 = boto3.resource("s3")
    bucket = s3.Bucket(my_s3Bucket_on_AWS)
    object = bucket.Object(my_s3_file_on_AWS)

    # Input and output temp files. tmp2 was not defined in the posted snippet,
    # and the step that downloads the S3 object into tmp was not shown.
    tmp = tempfile.NamedTemporaryFile()
    tmp2 = tempfile.NamedTemporaryFile()

    inputpdf = PdfFileReader(open(tmp.name, "rb"), strict=False)
    num_pages = inputpdf.getNumPages()

    # Copy every page of the input into a new writer.
    output = PdfFileWriter()
    for i in range(num_pages):
        logger.info(f"Adding page --> {i}")
        output.addPage(inputpdf.getPage(i))

    logger.info("Here getting UserWarning")
    with open(tmp2.name, "wb") as output_stream:
        output.write(output_stream)
This works perfectly for at least 10K PDFs, until one PDF produces the following warnings:
UserWarning: Unable to resolve [IndirectObject: IndirectObject(7, 0)], returning NullObject instead [pdf.py:644]
UserWarning: Unable to resolve [IndirectObject: IndirectObject(9, 0)], returning NullObject instead [pdf.py:644]
UserWarning: Unable to resolve [IndirectObject: IndirectObject(10, 0)], returning NullObject instead [pdf.py:644]
UserWarning: Unable to resolve [IndirectObject: IndirectObject(13, 0)], returning NullObject instead [pdf.py:644]
UserWarning: Unable to resolve [IndirectObject: IndirectObject(16, 0)], returning NullObject instead [pdf.py:644]
UserWarning: Unable to resolve [IndirectObject: IndirectObject(20, 0)], returning NullObject instead [pdf.py:644]
UserWarning: Unable to resolve [IndirectObject: IndirectObject(24, 0)], returning NullObject instead [pdf.py:644]
UserWarning: Unable to resolve [IndirectObject: IndirectObject(29, 0)], returning NullObject instead [pdf.py:644]
Any suggestions on how to fix this?
Note: The PDF I am trying to read is not empty; it has data.
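To find which of the many files triggers these warnings, one option is to capture them per file. The sketch below is only an illustration. It assumes the warnings go through Python's standard `warnings` module and that the library does not replace `warnings.showwarning` (if it does, the captured list may stay empty). The helper names `run_and_collect_warnings` and `has_unresolved_objects` are made up for this example.

```python
import warnings


def run_and_collect_warnings(func, *args, **kwargs):
    """Call func and return (result, list of UserWarning messages it emitted)."""
    with warnings.catch_warnings(record=True) as caught:
        # "always" prevents repeated warnings from being suppressed.
        warnings.simplefilter("always")
        result = func(*args, **kwargs)
    messages = [
        str(w.message) for w in caught if issubclass(w.category, UserWarning)
    ]
    return result, messages


def has_unresolved_objects(messages):
    """True if any message contains the 'Unable to resolve' text from the log."""
    return any("Unable to resolve" in m for m in messages)
```

Wrapping the per-file copy step in `run_and_collect_warnings` and logging the file name whenever `has_unresolved_objects` returns True would at least isolate the problematic PDF for closer inspection.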