Open packdat opened 5 months ago
Deleted objects are now handled when saving incrementally.
Note: Still untested with encrypted documents due to a lack of time.
Hey @packdat - can you share example code for stamping multiple pages and for placing multiple signatures/stamps on the same page?
Or please guide me on how to achieve that.
I was recently tasked to evaluate the possibility to "stamp" existing documents. A "stamp" is literally an image of an actual stamp that should be added to specific pages of a document. Problem is, the document may be signed, so the stamp has to be added in a non-destructive manner to keep the signature intact.
I started to hack around and was able to come up with something that seems to work.
The idea was to just track changes to arrays (`PdfArray`) and dictionaries (`PdfDictionary`). All other objects are basically immutable so this approach should work in theory. Also, a new `PdfDocumentOpenMode` was added, namely `Append`. When a document is opened in this mode, it starts to track changes to arrays and dictionaries. When saving the document, only changed/added objects are saved; the changes are appended to the existing document.

Basic code (taken from the included test-case):
There may be more that is needed to work consistently (e.g. I haven't tested with encrypted documents yet, as I was told the documents I have to work with will not be encrypted). This change also does not handle the case where objects were deleted from a document. These objects would need to be tracked separately as they would need special entries in the new XREF-table.
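To illustrate what those "special entries" could look like, here is a minimal sketch of formatting the cross-reference section of an incremental update. It is not the code from this PR: `IncrementalXref` and `Build` are hypothetical names, and the sketch assumes a classic (non-stream) XREF table and an original file whose only free object is object 0. Deleted objects get a free (`f`) entry with an incremented generation number and are linked into the free list that starts at object 0.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper, not part of PDFsharp.
public static class IncrementalXref
{
    // written: object number -> (byte offset, generation) of changed/added objects.
    // deleted: object number -> generation the object had before it was deleted.
    // Assumption: the original file has no free objects other than object 0.
    public static string Build(
        IReadOnlyDictionary<int, (long Offset, int Generation)> written,
        IReadOnlyDictionary<int, int> deleted)
    {
        var entries = new SortedDictionary<int, string>();

        foreach (var pair in written)
            entries[pair.Key] = $"{pair.Value.Offset:D10} {pair.Value.Generation:D5} n";

        if (deleted.Count > 0)
        {
            // Free entries form a linked list: object 0 is the head, every free
            // entry names the next free object, and the last one points back to 0.
            var freed = deleted.Keys.OrderBy(n => n).ToList();
            entries[0] = $"{freed[0]:D10} 65535 f";
            for (int i = 0; i < freed.Count; i++)
            {
                int next = i + 1 < freed.Count ? freed[i + 1] : 0;
                // The generation is incremented so old references no longer match.
                int generation = deleted[freed[i]] + 1;
                entries[freed[i]] = $"{next:D10} {generation:D5} f";
            }
        }

        var sb = new StringBuilder("xref\n");
        var numbers = entries.Keys.ToList();
        for (int start = 0; start < numbers.Count;)
        {
            // Consecutive object numbers share one subsection header "first count".
            int end = start;
            while (end + 1 < numbers.Count && numbers[end + 1] == numbers[end] + 1)
                end++;
            sb.Append($"{numbers[start]} {end - start + 1}\n");
            for (int i = start; i <= end; i++)
                sb.Append(entries[numbers[i]]).Append(" \n"); // 20 bytes per entry
            start = end + 1;
        }
        return sb.ToString();
    }
}
```

For example, with object 7 rewritten at offset 1234 and object 4 (generation 0) deleted, the section would contain three one-entry subsections: object 0 as a free entry pointing to 4, object 4 as a free entry with generation 1 pointing back to 0, and object 7 as an in-use entry. The trailer with its `/Prev` offset is left out of the sketch.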
One potential issue I spotted is that the library modifies the document just by reading certain properties, thus accidentally marking those objects as "modified" although you haven't changed anything. One example is the `*Box` properties of `PdfPage` (e.g. `TrimBox`, `CropBox`, ...). If you read these properties and the document does not already contain values for them, a new value is added to the underlying dictionary. I haven't looked too deeply but I expect there are more cases like that. I have changed the `*Box` properties to just return `PdfRectangle.Empty` when there is no value instead of adding a new value.

There is also the case of type-transformations (e.g. exchanging a `PdfDictionary` with a more specific type like `PdfPage`). These transformations happen "under the hood" and would normally also cause objects to be marked as modified. I tried to prevent that by temporarily ignoring changes during the type-transformations, using the new method.

This is quite "hack-ish", maybe you have better ideas on how to tackle this?
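For discussion, the two ideas above (reads that do not write, and a scope that temporarily ignores changes) can be sketched in isolation. This is a simplified stand-in, not the code in this PR: `TrackedDictionary`, `GetOrDefault` and `IgnoreChanges` are hypothetical names, and a plain `Dictionary<string, object>` takes the place of the real PDF object model.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a change-tracked dictionary; not PDFsharp's API.
public sealed class TrackedDictionary
{
    private readonly Dictionary<string, object> _items = new Dictionary<string, object>();
    private int _suppressDepth;

    // True once a tracked write has happened.
    public bool IsModified { get; private set; }

    public void Set(string key, object value)
    {
        _items[key] = value;
        // Writes inside an IgnoreChanges scope do not mark the object as modified.
        if (_suppressDepth == 0)
            IsModified = true;
    }

    // Side-effect-free read: a missing key yields the fallback
    // without storing it in the dictionary.
    public object GetOrDefault(string key, object fallback)
    {
        return _items.TryGetValue(key, out var value) ? value : fallback;
    }

    public bool ContainsKey(string key) => _items.ContainsKey(key);

    // Changes made while the returned scope is alive are not tracked.
    // Scopes may be nested; tracking resumes when the outermost one is disposed.
    public IDisposable IgnoreChanges()
    {
        _suppressDepth++;
        return new Scope(this);
    }

    private sealed class Scope : IDisposable
    {
        private readonly TrackedDictionary _owner;
        private bool _disposed;

        public Scope(TrackedDictionary owner) => _owner = owner;

        public void Dispose()
        {
            if (_disposed) return; // tolerate double dispose
            _disposed = true;
            _owner._suppressDepth--;
        }
    }
}
```

With this shape, an internal transformation would wrap its writes in `using (dict.IgnoreChanges()) { ... }`, and a getter such as the `*Box` properties would call `GetOrDefault` instead of inserting a default value. A depth counter rather than a boolean flag is used so that nested transformations do not re-enable tracking too early.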