This PR attempts to resolve the issues described in #73 and #46 in a more generic way.
It also supersedes #53 by removing the need to handle objects stored in object-streams in a special way.
The "lazy loading" aspect is handled by the new class `PdfReferenceToCompressedObject`, which is a subclass of `PdfReference`.
While the document's xref-streams are processed, references to objects stored in object-streams are collected as `PdfReferenceToCompressedObject` instances.
When the `Value` of such a reference is accessed (which may happen while parsing another object that contains a reference to the compressed object), the object-stream is loaded and decrypted (if not already done) and the actual object is read from it.
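
To illustrate the idea, here is a minimal, language-neutral sketch in Python. It is not the code in this PR: `ObjectStream`, `Reference`, `ReferenceToCompressedObject`, `load_count` and the sample data are hypothetical names made up for the example, and decryption is only hinted at by the loader callback.

```python
from typing import Any, Callable, Dict


class ObjectStream:
    """Hypothetical stand-in for an object-stream; decodes its contents at most once."""

    def __init__(self, loader: Callable[[], Dict[int, Any]]):
        self._loader = loader      # would read, decrypt and parse the stream
        self._objects = None       # filled in on first access
        self.load_count = 0        # only here to make the laziness observable

    def get(self, index: int) -> Any:
        if self._objects is None:  # load (and decrypt) only if not already done
            self._objects = self._loader()
            self.load_count += 1
        return self._objects[index]


class Reference:
    """Hypothetical plain reference whose value is already known."""

    def __init__(self, object_number: int, value: Any = None):
        self.object_number = object_number
        self._value = value

    @property
    def value(self) -> Any:
        return self._value


class ReferenceToCompressedObject(Reference):
    """Reference that resolves its value from an object-stream on first access."""

    def __init__(self, object_number: int, stream: ObjectStream, index: int):
        super().__init__(object_number)
        self._stream = stream
        self._index = index
        self._resolved = False     # separate flag, since a valid value may be None

    @property
    def value(self) -> Any:
        if not self._resolved:
            self._value = self._stream.get(self._index)
            self._resolved = True
        return self._value


# While "processing the xref-stream", only references are collected;
# the object-stream itself has not been touched yet.
stream = ObjectStream(lambda: {0: {"Type": "Font"}, 1: {"Type": "Page"}})
refs = [
    ReferenceToCompressedObject(10, stream, 0),
    ReferenceToCompressedObject(11, stream, 1),
]
```

In this sketch, collecting the references does not trigger any loading. The first access to `value` on either reference loads the stream, and later accesses reuse the already decoded contents.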
I have not found any issues so far when running automated tests with these changes against ~1000 PDF files (testing page import).
Note:
The PR also includes some minor tweaks not directly related to object loading that I think are helpful, such as reporting the position within a document where an unexpected token was encountered during parsing.