Open thomassajot opened 3 years ago
This is likely due to pdfrw, the underlying library that pdf_annotate uses to read, edit, and write PDFs. You could try reading in and writing back out that file using just pdfrw and see if the pages are missing.
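A minimal sketch of that round-trip check, in case it helps. The function name `roundtrip_page_counts` and the file paths are made up for illustration; the pdfrw calls (`PdfReader`, `PdfWriter.addpages`, `PdfWriter.write`) are the ones I would expect to use for this, but treat it as a starting point rather than a definitive test.

```python
def roundtrip_page_counts(src_path, dst_path):
    """Read a PDF with pdfrw, write it back out, and return both page counts.

    Hypothetical helper: returns (pages seen on read, pages in the rewritten file).
    """
    # Imported here so the helper can be defined even where pdfrw is not installed.
    from pdfrw import PdfReader, PdfWriter

    reader = PdfReader(src_path)
    pages_read = len(reader.pages)

    # Copy every page pdfrw found into a fresh document and save it.
    writer = PdfWriter()
    writer.addpages(reader.pages)
    writer.write(dst_path)

    # Re-open the output to see how many pages survived the round trip.
    pages_written = len(PdfReader(dst_path).pages)
    return pages_read, pages_written


# Example call (paths are placeholders):
# pages_read, pages_written = roundtrip_page_counts("input.pdf", "roundtrip.pdf")
```

If the first number is already lower than the real page count, the pages are being dropped when pdfrw parses the file, before pdf_annotate does anything with it.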
I also ran into a similar problem, and it comes from pdfrw's PdfReader, as shown below. (test.pdf actually has 19 pages.)
>>> from pdfrw import PdfReader
>>> from PyPDF2 import PdfFileReader
>>> filename = './test.pdf'
>>> pdf_reader = PdfReader(filename)
>>> len(pdf_reader.pages)
2
>>> pdf_file_reader = PdfFileReader(open(filename, 'rb'))
>>> pdf_file_reader.getNumPages()
19
>>> from PyPDF3 import PdfFileReader
>>> pdf_file_reader = PdfFileReader(open(filename, 'rb'))
>>> pdf_file_reader.getNumPages()
19
I raised this issue on that repo and I'm still waiting for an answer, but I'm not sure I'll get one, because there have been no changes there since 2018.
Can't use preexisting streams like pyPdf while initializing PdfReader
Could you allow or change PdfAnnotator to use PdfFileReader and PdfFileWriter from PyPDF3, which is a fork of PyPDF2 and is still actively improved?
Hello. Surprisingly, some pages are missing when using pdf_annotate. Example PDF, with 212 pages: https://www.hkexgroup.com/-/media/HKEX-Group-Site/ssd/Investor-Relations/Regulatory-Reports/documents/2016/160321ar_e.pdf?la=en
When running the following code, the new file is missing 2 pages: the second page and the second-to-last page. Any idea why?