Open willfill opened 13 years ago
It isn't clear from the PDF spec whether duplicate keys are allowed in a dictionary: http://pdf.editme.com/pdfua-docinfodictionary http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/pdf_reference_1-7.pdf (Section 10.2.1). The terminology (dictionary, key/value) seems to imply unique keys. It is clear, however, that some programs create documents with duplicate keys, and pyPdf cannot read them because it raises an error as soon as it sees a repeated key.
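The two readings of the spec map onto two parser policies: reject a repeated key, or tolerate it. Here is a minimal sketch of both, assuming the key/value pairs have already been parsed. This is not pyPdf's actual code; `build_dictionary`, `DuplicateKeyError` and the `strict` flag are hypothetical names, and keeping the first value rather than the last is an arbitrary choice made for the sketch.

```python
import warnings


class DuplicateKeyError(Exception):
    """Hypothetical error type, standing in for pyPdf's PdfReadError."""


def build_dictionary(pairs, strict=True):
    """Build a dict from already-parsed (key, value) pairs.

    strict=True  -> raise on the first repeated key.
    strict=False -> keep the first value seen and emit a warning.
    """
    result = {}
    for key, value in pairs:
        if key in result:
            if strict:
                raise DuplicateKeyError(
                    "multiple definitions in dictionary: %s" % key)
            warnings.warn("ignoring repeated key %s" % key)
            continue
        result[key] = value
    return result


# Made-up example data: "/Title" is defined twice.
pairs = [("/Title", "Report"), ("/Author", "A"), ("/Title", "Draft")]
lenient = build_dictionary(pairs, strict=False)  # "/Title" stays "Report"
```

With `strict=True` the same input raises, which is the behaviour the traceback in this report shows.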
I have some code:

import pyPdf

def getPDFContent():
    content = ""
    # Load PDF into pyPDF

f = open(pathToTxt, 'w+')
f.write(getPDFContent())
f.close()

where pathToPdf and pathToTxt are absolute paths to the files. But I got this error:

Traceback (most recent call last):
  File "C:/Users/will/Desktop/coding/mytest.py", line 21, in <module>
    print pdf.getPage(14)
  File "C:\Python\lib\site-packages\pyPdf\pdf.py", line 450, in getPage
    self._flatten()
  File "C:\Python\lib\site-packages\pyPdf\pdf.py", line 607, in _flatten
    self._flatten(page.getObject(), inherit, **addt)
  File "C:\Python\lib\site-packages\pyPdf\generic.py", line 165, in getObject
    return self.pdf.getObject(self).getObject()
  File "C:\Python\lib\site-packages\pyPdf\pdf.py", line 649, in getObject
    retval = readObject(self.stream, self)
  File "C:\Python\lib\site-packages\pyPdf\generic.py", line 67, in readObject
    return DictionaryObject.readFromStream(stream, pdf)
  File "C:\Python\lib\site-packages\pyPdf\generic.py", line 531, in readFromStream
    value = readObject(stream, pdf)
  File "C:\Python\lib\site-packages\pyPdf\generic.py", line 67, in readObject
    return DictionaryObject.readFromStream(stream, pdf)
  File "C:\Python\lib\site-packages\pyPdf\generic.py", line 531, in readFromStream
    value = readObject(stream, pdf)
  File "C:\Python\lib\site-packages\pyPdf\generic.py", line 67, in readObject
    return DictionaryObject.readFromStream(stream, pdf)
  File "C:\Python\lib\site-packages\pyPdf\generic.py", line 534, in readFromStream
    raise utils.PdfReadError, "multiple definitions in dictionary"
pyPdf.utils.PdfReadError: multiple definitions in dictionary
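
The traceback passes through readFromStream three times before raising, so the repeated key appears to sit in a dictionary nested two levels inside the object being read, not at the top level. To see which key is involved, a rough diagnostic could walk the dictionary text and report repeated names along with their nesting path. The sketch below is only a toy under heavy assumptions: it understands names, `<<`/`>>` delimiters and single bare tokens such as numbers, and nothing else (no strings, arrays, streams or indirect references). `find_duplicate_keys` is a hypothetical helper, not part of pyPdf, and the sample dictionary is made up.

```python
import re

# Tokens: dictionary delimiters, names such as /Type, or any other bare token.
TOKEN = re.compile(r"<<|>>|/[^\s/<>\[\]()]*|[^\s/<>]+")


def _scan(tokens, path, duplicates):
    """Consume tokens up to the matching '>>', recording repeated keys."""
    seen = set()
    for key in tokens:
        if key == ">>":
            return
        if key in seen:
            duplicates.append((path, key))
        seen.add(key)
        value = next(tokens, None)
        if value == ">>":  # key with no value: dictionary ended early
            return
        if value == "<<":  # nested dictionary: recurse with a longer path
            _scan(tokens, path + key, duplicates)


def find_duplicate_keys(text):
    """Return (path, key) pairs for keys defined twice in one dictionary."""
    tokens = iter(TOKEN.findall(text))
    duplicates = []
    if next(tokens, None) == "<<":
        _scan(tokens, "", duplicates)
    return duplicates


sample = "<< /Type /Page /Resources << /Font << /F1 1 /F1 2 >> >> >>"
print(find_duplicate_keys(sample))  # [('/Resources/Font', '/F1')]
```

The same key appearing in two different dictionaries is not reported, since each dictionary keeps its own set of seen keys.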