Visible signature loses its state

LibrePDF / OpenPDF

OpenPDF is a free Java library for creating and editing PDF files, with an LGPL and MPL open source license. OpenPDF is based on a fork of iText. We welcome contributions from other developers. Please feel free to submit pull requests and bug reports to this GitHub repository.

Visible signature loses its state #296

Closed bsanchezb closed 3 years ago

bsanchezb commented 4 years ago

Hello,

We have been facing an issue since the last commit from 21.10.2019 (7e5a8f0) when creating a visual signature. The problem is that the code produces two different outputs when I create the data to be digested and the signed document.

The code of the library is quite complex, so I cannot find the issue by myself.

Examples of the two produced files are attached. The problem seems to be in the objects of types ObjStm and XRef.

Please have a look at it before the next release, because the problem is very critical (HASH_FAILURE for all PAdES with a visible signature). Before the specified commit everything worked like a charm.

Regards, Aleksandr.

toBeDigested.pdf signedDoc.pdf

sixdouglas commented 4 years ago

@bsanchezb Acrobat complains when trying to open the toBeSigned.pdf file: the document may be corrupted. I can open it in Chrome, though. Can you add here the code you used to generate it? What do you mean by:

the code produces two different outputs

bsanchezb commented 4 years ago

@sixdouglas Adobe complains because it is not a "toBeSigned" document; it is pre-computed data, which will be hashed and actually signed (if you open the document in a text editor, you will see that it contains an empty "contents" element).

In DSS (a digital signature application), we first compute a digest to be signed, and then we actually sign the document. In PAdES the signature covers the entire document, therefore all signature-related content must be included in the PDF itself (all B-level attributes and dictionaries are included in the PDF, leaving only an empty "contents" attribute with a fixed content length for the CMS Signed Data). If you compare the content of the two files in a text editor, you will see that they differ in the ObjStm and XRef objects. Therefore, when you generate two files with the same content inside (only the "contents" is filled in the second case), the OpenPDF code produces two different outputs.

We never had this problem before, and it began after the commit mentioned above.
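As a rough illustration of why any byte difference outside the "contents" placeholder leads to a hash failure, here is a minimal sketch that hashes everything around a reserved gap. It is not DSS or OpenPDF code: the class name, the method name, the toy "HEAD<...>TAIL" document and the choice of SHA-256 are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class ByteRangeDigest {

    // Hashes all bytes of doc except those in [gapStart, gapEnd), which stand
    // in for the reserved "contents" placeholder (hypothetical layout).
    public static byte[] digestAroundGap(byte[] doc, int gapStart, int gapEnd) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(doc, 0, gapStart);                  // bytes before the gap
            md.update(doc, gapEnd, doc.length - gapEnd);  // bytes after the gap
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    public static void main(String[] args) {
        // Toy documents: the gap is the '<...>' part at offsets 4..9.
        byte[] prepared  = "HEAD<0000>TAIL".getBytes(StandardCharsets.US_ASCII);
        byte[] signed    = "HEAD<ABCD>TAIL".getBytes(StandardCharsets.US_ASCII);
        byte[] reordered = "HAED<ABCD>TAIL".getBytes(StandardCharsets.US_ASCII);

        byte[] expected = digestAroundGap(prepared, 4, 10);
        // Only the gap differs, so the digests match.
        System.out.println(Arrays.equals(expected, digestAroundGap(signed, 4, 10)));
        // Bytes outside the gap were reordered, so the digests differ.
        System.out.println(Arrays.equals(expected, digestAroundGap(reordered, 4, 10)));
    }
}
```

The third toy document mimics the situation described here: the same content, but written in a different order outside the placeholder.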

mkl-public commented 4 years ago

@bsanchezb

I just inserted the signature value from your signedDoc.pdf into your toBeDigested.pdf resulting in toBeDigested-WithSignature.pdf. It also is broken. Thus, you do have other issues, too.

The difference in the object stream is the order in which two objects with identical content appear (two graphics state resources). This implies a difference in the cross reference stream as it contains the (optional) position inside object streams.

In my opinion you should change your approach and store the initially created PDF (which you currently drop after hashing) temporarily to eventually inject the signature container therein instead of in a newly generated copy you hope to be identical.
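The suggested approach could look roughly like the sketch below: keep the bytes that were hashed and write the hex-encoded signature container into the reserved placeholder of those same bytes. This is only an illustration on a toy byte array; the class name, the method name and the placeholder layout ('<', hex digits, '>') are assumptions, not OpenPDF or DSS API.

```java
import java.nio.charset.StandardCharsets;

public class SignatureInjector {

    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // Returns a copy of prepared with the hex-encoded container written into
    // the placeholder at [gapStart, gapEnd). Unused digits keep their padding.
    public static byte[] inject(byte[] prepared, int gapStart, int gapEnd, byte[] container) {
        if (prepared[gapStart] != '<' || prepared[gapEnd - 1] != '>') {
            throw new IllegalArgumentException("gap does not delimit a hex string");
        }
        int capacity = gapEnd - gapStart - 2; // hex digits available between < and >
        if (container.length * 2 > capacity) {
            throw new IllegalArgumentException("container does not fit the placeholder");
        }
        byte[] out = prepared.clone(); // every byte outside the gap stays untouched
        for (int i = 0; i < container.length; i++) {
            out[gapStart + 1 + 2 * i] = (byte) HEX[(container[i] >> 4) & 0xF];
            out[gapStart + 2 + 2 * i] = (byte) HEX[container[i] & 0xF];
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] prepared = "HEAD<00000000>TAIL".getBytes(StandardCharsets.US_ASCII);
        byte[] container = {(byte) 0xAB, (byte) 0xCD}; // stand-in for the CMS bytes
        byte[] signed = inject(prepared, 4, 14, container);
        System.out.println(new String(signed, StandardCharsets.US_ASCII));
    }
}
```

Because the output is derived from the stored bytes rather than from a second generation run, the hashed ranges cannot change between digesting and signing.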

PDF is not a format that requires byte-wise identical files for the same input; actually on the contrary it expects each distinct run of PDF generation or manipulation to have a distinct ID; OpenPDF supports this and has introduced an override for that explicitly for eSig DSS.

Nonetheless byte-wise identical outputs usually aren't a development target of PDF libraries. So changes that result in differences like here will keep occurring, not only in OpenPDF but also in PDFBox, your alternative PDF backend.

Thus, that change of approach will eventually reduce efforts on your side, too.

As mentioned initially, this is my opinion and does not represent what the maintainers or contributors here think or plan.

bsanchezb commented 4 years ago

Dear @mkl-public, thank you for your comments. Of course the signature will be broken, because these two documents have different content (exactly for the reason you described afterwards). We create a signature in three steps:

1. get dataToSign (toBeDigested);
2. get signature value (sign the digest);
3. generate the signature document.

Only in the case of PAdES is the dataToSign equal to the output signed document (without a signature value). It will not be a simple fix, and the latest changes in OpenPDF make a further migration to the next version complicated.

bsanchezb commented 4 years ago

Hello Michael,

I re-checked the case this morning; indeed, yesterday I misunderstood what you did. The signature there has a problem with a certificate.

To be consistent, I attach new files created with the develop branch of OpenPDF, along with the merged result "toBeDigested-WithSignature". toBeDigested.pdf signed.pdf toBeDigestedSigned.pdf

mkl-public commented 4 years ago

Do you have a test case with which one can reproduce the issue?

bsanchezb commented 4 years ago

@mkl-public you can try running the PAdESSignatureField test with OpenPDF version 1.3.12-SNAPSHOT

mkl-public commented 4 years ago

@bsanchezb

A work-around for your issue is to replace

protected HashMap<PdfDictionary, PdfObject[]> documentExtGState = new HashMap<>();

in PdfWriter by

protected Map<PdfDictionary, PdfObject[]> documentExtGState = new LinkedHashMap<>();

which will make sure that the ExtGState dictionaries will be stored in the order they were created, not in an unpredictable hashing order.
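To illustrate the difference, here is a small self-contained sketch. The ResourceKey class is a hypothetical stand-in for a resource dictionary and is assumed to keep Object's identity-based hashCode; whether PdfDictionary behaves exactly like this is not stated in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OrderedRegistryDemo {

    // Hypothetical key type: no equals/hashCode override, so its hash code is
    // identity-based and may differ from one JVM run to the next.
    static final class ResourceKey {
        final String label;
        ResourceKey(String label) { this.label = label; }
    }

    // Registers the labels in the given map and returns them in the order the
    // map iterates, i.e. the order in which a writer would emit them.
    public static List<String> emitOrder(Map<ResourceKey, String> registry, List<String> labels) {
        for (String label : labels) {
            registry.put(new ResourceKey(label), label);
        }
        return new ArrayList<>(registry.values());
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList("GS1", "GS2", "GS3", "GS4");
        // LinkedHashMap iterates in insertion order.
        System.out.println("LinkedHashMap: " + emitOrder(new LinkedHashMap<>(), labels));
        // HashMap makes no ordering guarantee; the result may vary between runs.
        System.out.println("HashMap:       " + emitOrder(new HashMap<>(), labels));
    }
}
```

Only the LinkedHashMap line is guaranteed to list the labels in the order they were added.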

Similarly, there are a number of other HashMap members in PdfWriter which also have to be made ordered to support a reproducible order of objects in arbitrarily created PDFs, e.g. for shading and pattern resources.

Interestingly there is one other mapping which already has been made ordered before:

protected LinkedHashMap<BaseFont, FontDetails> documentFonts = new LinkedHashMap<>();

The fonts have been made ordered in a check-in dated 2008-08-29 by Paulo Soares (commit 5faf13c8205e53991a56a51e3679d6c946bfa001 in the more complete iText repositories) with the comment

Fonts are output in the order they are added.

Apparently there was an interest in a more predictable order of objects already earlier in history...

Nonetheless I'd call this a work-around for your specific usage of OpenPDF and not a fix because iText/OpenPDF never guaranteed byte-wise identical outputs for identical inputs, merely equivalent ones as PDF documents.

bsanchezb commented 4 years ago

@mkl-public thank you for the investigation. It works indeed.

asturio commented 3 years ago

Hi @bsanchezb and @mkl-public, just doing some issue grooming. I think this is already fixed in #304, isn't it?

bsanchezb commented 3 years ago

Hello @asturio ,

Yes, the issue has been fixed in pull request #304.

I will close the issue.

Cheers, Aleksandr.