LibrePDF / OpenPDF

OpenPDF is a free Java library for creating and editing PDF files, with an LGPL and MPL open source license. OpenPDF is based on a fork of iText. We welcome contributions from other developers. Please feel free to submit pull requests and bug reports to this GitHub repository.

PAdES signatures support #86

Open andreasrosdal opened 6 years ago

andreasrosdal commented 6 years ago

PAdES support in OpenPDF would be nice to verify the authenticity of PDF documents such as invoices.

Search for OpenPDF here: https://ec.europa.eu/cefdigital/DSS/webapp-demo/doc/dss-documentation.html

https://en.wikipedia.org/wiki/PAdES

http://www.etsi.org

https://librepdf.github.io/OpenPDF/docs-1-1-0/com/lowagie/text/pdf/PdfStamper.html

https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf

https://developers.itextpdf.com/examples/security-itext5/digital-signatures-white-paper/digital-signatures-chapter-2

This is highly relevant: PAdES, with a LGPL license: https://github.com/esig/dss https://github.com/esig/dss/tree/master/dss-pades

Ongoing work on OpenPDF integration in DSS: https://github.com/esig/dss/tree/openpdf-integration

Lonzak commented 6 years ago

I am highly interested in this. What does the closing mean? Is it already implemented? Was it discarded?

andreasrosdal commented 6 years ago

Thanks. Reopened. Perhaps you want to help work on this? Any help is welcome, for example, programming and submitting patches, or collecting information and describing requirements specifications.

Lonzak commented 6 years ago

I have been following this project for quite some time now, since we have been maintaining our own fork for nearly 10 years. My plan would be to create a (probably bigger) patch to migrate our fixes and additions to this project so that we can stop our fork (e.g. we had done the BC update, too)... For PAdES, I fear I currently don't have the time to work on it, since this is not a small topic... I mean we could start small by only supporting the Basic profile in the beginning. We should probably ask MKL to help out here :-)

andreasrosdal commented 6 years ago

@Lonzak Your patches would be very welcome! Since you have been maintaining your own fork, perhaps you would be interested in commit access to this project also?

How do we start implementing the basic PAdES profile?

Lonzak commented 6 years ago

Commit access would be great.

For the basic profile we would need to: (1) analyse the existing signing function; (2) understand the PAdES way of signing (e.g. how it differs from the old way); (3) adapt the existing signing to support PAdES; or (4) completely rewrite the signing code, since it is quite a mess (this is what the iText developers did).

Lonzak commented 6 years ago

This is highly relevant: PAdES, with a LGPL license: https://github.com/esig/dss https://github.com/esig/dss/tree/master/dss-pades

Yes, I have already worked with that project, but they are using PDFBox. They switched away from iText a while ago due to the license change. But they developed a PAdES signing component. So maybe an external signing process using OpenPDF would be possible. This could be a way...

pvandenbroucke commented 6 years ago

Hello,

This fork is interesting. As Lonzak said, DSS used iText in its old versions. It should be possible to re-integrate it in DSS with a new module (dss-pades-pdfbox / dss-pades-openpdf).

Do you have any documentation about the evolutions since the fork creation ?

Regards,

Pierrick

andreasrosdal commented 6 years ago

This fork is interesting. As Lonzak said, DSS used iText in its old versions. It should be possible to re-integrate it in DSS with a new module (dss-pades-pdfbox / dss-pades-openpdf).

Integrating DSS with OpenPDF by creating dss-pades-pdfbox / dss-pades-openpdf seems like a great idea! That would give users of DSS the choice of which PDF library to use.

If you want to help improve the signatures support in OpenPDF, that would be very welcome!

Do you have any documentation about the evolutions since the fork creation ?

https://github.com/LibrePDF/OpenPDF/releases

https://github.com/LibrePDF/OpenPDF/pulls?q=is%3Apr+is%3Aclosed

andreasrosdal commented 6 years ago

It should be possible to re-integrate it in DSS with a new module (dss-pades-pdfbox / dss-pades-openpdf).

@pvandenbroucke How do you suggest we begin in order to do this?

pvandenbroucke commented 6 years ago

Hello Andreas,

I'll first try to split the dss-pades module into two modules (api / pdfbox implementation), then retrieve the previous integration of iText and analyze what has been changed since the integration. The goal is to get a common API and two implementations (openpdf and pdfbox).

I'll soon create a new branch with the refactoring and integration. Feel free to contribute.

Regards,

Pierrick

pvandenbroucke commented 6 years ago

Hello,

I started the OpenPDF integration in DSS. The related branch is : https://github.com/esig/dss/tree/openpdf-integration

I'm facing an issue when generating the data to be signed. If I run the same operation twice with the same parameters, I get different digests. I identified the difference in the PDF file : trailer <</Info 40 0 R/ID [<ca4b0fd5553ebacf2d6fb09534ae4cca><979a9613d657bc6dcefdbca81e43e1a1>]/Root 39 0 R/Size 46/Prev 135809>> startxref

The generated fileID is different on each call. Here is the related line in the OpenPDF code. I added unit tests which fail due to this fileID change.

Do you have an idea how to stabilize the fileID ? Feel free to contribute too.

Regards,

Pierrick

PS: The integration is very minimalist. Many parts are missing (VRI/DSS dictionaries, visible signatures,...) and others need to be rewritten.
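To make the symptom concrete, here is a minimal sketch using only the JDK (it is not DSS or OpenPDF code). It hashes two trailers that differ only in the second /ID entry and shows that the digests of the data to be signed can no longer match. The class name, the helper methods and the second, all-zero ID value are made up for illustration; the first trailer is the one quoted in this comment.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class DigestStabilitySketch {

    // SHA-256 of the given bytes, as lowercase hex.
    static String sha256Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Trailer shaped like the one quoted above; only the second /ID entry varies.
    static byte[] trailer(String secondId) {
        String t = "trailer <</Info 40 0 R/ID [<ca4b0fd5553ebacf2d6fb09534ae4cca><"
                + secondId + ">]/Root 39 0 R/Size 46/Prev 135809>> startxref";
        return t.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] first = trailer("979a9613d657bc6dcefdbca81e43e1a1");
        byte[] second = trailer("00000000000000000000000000000000"); // hypothetical second run
        System.out.println("same digest: " + sha256Hex(first).equals(sha256Hex(second)));
    }
}
```

A unit test for the integration could follow the same pattern: produce the data to be signed twice with identical parameters and compare the two digests.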

andreasrosdal commented 6 years ago

The fileID is hashed using the current time, so it will currently always be unstable. This is done in PdfEncryption.createDocumentId(), source code here: PdfEncryption.

It should be easy to change this so that fileID will be stable. What would be the best way for you to specify that you want a stable fileID? Or should we change OpenPDF so that PdfEncryption.createDocumentId() always returns a stable document ID regardless of time?
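The difference between the two options can be sketched with the JDK alone. This is not the OpenPDF implementation: the class and method names are hypothetical, and the time-based variant only imitates the behaviour described in this thread (current time plus a static sequence). The stable variant derives the ID from bytes chosen by the caller, so repeated calls with the same input return the same ID.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.atomic.AtomicLong;

class FileIdSketch {

    // Static sequence, incremented on every time-based ID (assumption for this sketch).
    private static final AtomicLong SEQ = new AtomicLong();

    static byte[] md5(byte[] input) {
        try {
            return MessageDigest.getInstance("MD5").digest(input);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Unstable: the seed contains the clock and a counter, so every call differs.
    static byte[] timeBasedId() {
        String seed = System.currentTimeMillis() + "+" + SEQ.getAndIncrement();
        return md5(seed.getBytes(StandardCharsets.UTF_8));
    }

    // Stable: the ID depends only on the bytes supplied by the caller.
    static byte[] stableId(byte[] callerSeed) {
        return md5(callerSeed);
    }
}
```

With a caller-supplied seed, a signing library can compute the data to be signed twice and get identical bytes, which is what an external signing flow needs.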

pvandenbroucke commented 6 years ago

Yes, I saw that in the code and the incremented static seq.

If you find a convenient way to specify that in the PdfStamper, it would be nice.

andreasrosdal commented 6 years ago

What do you think of this proposal?

https://github.com/LibrePDF/OpenPDF/commit/d714c6a3b2b41779d7d693df68aeb841f7cec27f

pvandenbroucke commented 6 years ago

That can be a solution but that needs to be exposed in the PdfStamper. The class PdfStamperImp seems to be an internal class.

I don't know very well iText and OpenPDF and how the code evolved.

Edit : I locally tried. The method PdfEncryption.createInfoId(overrideFileID); calls createDocumentId(); which generates a documentId based on time and the static seq. We have the same issue.

mkl-public commented 6 years ago

That can be a solution but that needs to be exposed in the PdfStamper.

PdfStamper has a method getWriter via which you can retrieve the PdfStamperImp instance (PdfStamperImp is derived from PdfWriter). Unfortunately, though, the class PdfStamperImp itself is not public. To be able to universally use those new methods, PdfStamperImp must either become public, or implement a yet-to-be-defined interface with those methods, or those methods must already be declared in the parent PdfWriter class.
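The interface option can be illustrated with a small, self-contained sketch. None of these types are OpenPDF classes: Writer stands in for the public parent class, StamperImp for the non-public implementation, and FileIdConfigurable and its methods are hypothetical names invented for this example.

```java
class StamperAccessSketch {

    // Public contract that callers outside the package could rely on (hypothetical).
    public interface FileIdConfigurable {
        void setOverrideFileId(byte[] id);
        byte[] getOverrideFileId();
    }

    // Stand-in for the public parent type.
    public static class Writer { }

    // Stand-in for the non-public implementation; callers cannot name this type.
    private static final class StamperImp extends Writer implements FileIdConfigurable {
        private byte[] overrideFileId;

        @Override
        public void setOverrideFileId(byte[] id) {
            this.overrideFileId = id == null ? null : id.clone();
        }

        @Override
        public byte[] getOverrideFileId() {
            return overrideFileId == null ? null : overrideFileId.clone();
        }
    }

    // Stand-in for a getWriter()-style accessor that only exposes the parent type.
    public static Writer getWriter() {
        return new StamperImp();
    }

    public static void main(String[] args) {
        Writer writer = getWriter();
        if (writer instanceof FileIdConfigurable) {
            // The cast targets the public interface, not the hidden class.
            ((FileIdConfigurable) writer).setOverrideFileId(new byte[] {1, 2, 3});
        }
    }
}
```

The trade-off is that callers need an instanceof check and a cast, whereas declaring the methods on the parent class avoids the cast but widens the parent's API for every writer.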

andreasrosdal commented 6 years ago

I have created a proposal where we can add includeFileID and overrideFileId properties to PdfStamper, which allows disabling the fileId and overriding the fileId.

Pull request: https://github.com/LibrePDF/OpenPDF/pull/94 Branch: dss-openpdf https://github.com/LibrePDF/OpenPDF/tree/dss-openpdf

Please let me know if this solves your problem. If you have other suggestions, then you can also try creating a pull-request which is very likely to be accepted. Thanks!

pvandenbroucke commented 6 years ago

Hello,

Thanks that allows me to specify the fileID. That's perfect for me.

I use the snapshot version in the same branch : https://github.com/esig/dss/tree/openpdf-integration

Regards,

Pierrick

andreasrosdal commented 6 years ago

Cool. Let me know if you need further changes. I can also create a new release of OpenPDF when you want to.

andreasrosdal commented 6 years ago

@pvandenbroucke What are the remaining changes required for a minimum first version? Do you know of any other changes required to OpenPDF?

- VRI/DSS dictionaries
- visible signatures

pvandenbroucke commented 6 years ago

Yep, I need to implement them in the dss-pades-openpdf module. I think those features already exist (create a dictionary and add it to an existing PDF file, plus visual signatures).

I'm now able to create a basic PAdES with OpenPDF and DSS (BASELINE_PROFILE) and currently trying to extend it with timestamps,... The evolutions are visible on the same branch.

Anyway, I have a question. How do you handle the file termination ? I mean how do you know how to correctly finish the file ? eg : %%EOF, %%EOF EOL,...

The release is not urgent for me. I'll need time to finish the integration and to test it. If you want to release a new version, feel free to do it. ;-)

Lonzak commented 6 years ago

Do you mean in the PDF file?

Up to PDF 1.7 it was: "The trailer of a PDF file enables a conforming reader to quickly find the cross-reference table and certain special objects. Conforming readers should read a PDF file from its end. The last line of the file shall contain only the end-of-file marker, %%EOF. The two preceding lines shall contain, one per line and in order, the keyword startxref and the byte offset in the decoded stream from the beginning of the file to the beginning of the xref keyword in the last cross-reference section. "

So it is %%EOF in an otherwise empty line
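As a rough illustration of the quoted rule, the following JDK-only sketch reads the offset that sits between the last startxref keyword and the last %%EOF marker. The class and method names are hypothetical, and a real reader would scan only the tail of the file instead of decoding all of it.

```java
import java.nio.charset.StandardCharsets;

class StartXrefSketch {

    // Returns the offset written after the last "startxref", or -1 if the tail is malformed.
    static long lastStartXref(byte[] pdf) {
        // ISO-8859-1 maps every byte to exactly one char, so string indexes equal byte offsets.
        String text = new String(pdf, StandardCharsets.ISO_8859_1);
        int eof = text.lastIndexOf("%%EOF");
        if (eof < 0) {
            return -1;
        }
        int keyword = text.lastIndexOf("startxref", eof);
        if (keyword < 0) {
            return -1;
        }
        // Whatever lies between the keyword and the marker should be the decimal offset.
        String between = text.substring(keyword + "startxref".length(), eof).trim();
        try {
            return Long.parseLong(between);
        } catch (NumberFormatException e) {
            return -1;
        }
    }
}
```

Because the search starts from the end, the sketch gives the same answer whether or not an end-of-line follows the final %%EOF.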

pvandenbroucke commented 6 years ago

I ran into issues with DSS when I wanted to recover the original file from a signed PDF. Sometimes, I had a difference of one character (end-of-line after the %%EOF).

The original file can be ended with %%EOF or %%EOF EOL.

I'd like to know if OpenPDF handles these two kinds of termination when it adds a new layer (signature, timestamp,...).

(Here is the related issue on the Jira of PDFBox.)
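The one-character ambiguity can be shown with a short JDK-only sketch. It is not DSS, PDFBox or OpenPDF code, and all names are hypothetical. Given a signed file and the offset where the appended revision is assumed to start, it locates the previous %%EOF and reports how many end-of-line bytes follow it; those bytes may belong either to the original file or to the new layer.

```java
import java.nio.charset.StandardCharsets;

class RevisionEndSketch {

    private static final byte[] EOF_MARKER = "%%EOF".getBytes(StandardCharsets.US_ASCII);

    // Index just past the last "%%EOF" that ends at or before 'limit', or -1 if none is found.
    static int endOfMarker(byte[] pdf, int limit) {
        for (int i = Math.min(limit, pdf.length) - EOF_MARKER.length; i >= 0; i--) {
            boolean match = true;
            for (int j = 0; j < EOF_MARKER.length; j++) {
                if (pdf[i + j] != EOF_MARKER[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return i + EOF_MARKER.length;
            }
        }
        return -1;
    }

    // Length of the end-of-line at 'pos': 2 for CR LF, 1 for a lone CR or LF, 0 otherwise.
    static int eolLength(byte[] pdf, int pos) {
        if (pos < 0 || pos >= pdf.length) {
            return 0;
        }
        if (pdf[pos] == '\r') {
            return (pos + 1 < pdf.length && pdf[pos + 1] == '\n') ? 2 : 1;
        }
        return pdf[pos] == '\n' ? 1 : 0;
    }
}
```

The original file therefore has two candidate lengths: the value returned by endOfMarker, and that value plus eolLength. Without extra information, both candidates are plausible, which matches the one-character difference described in this comment.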

andreasrosdal commented 6 years ago

https://github.com/LibrePDF/OpenPDF/blob/94f6a23b5dc40ea1705353d64576ed898508f7df/openpdf/src/main/java/com/lowagie/text/pdf/PdfWriter.java#L577

Does this answer your EOF question?

pvandenbroucke commented 6 years ago

Yes, it does.

IMO, the last end-of-line (\n) shouldn't be added. It could be preferable to add the end-of-line at the beginning of the new layer (if it is not already present).

I'll do some more tests these next days to see if I meet the same issue.

mkl-public commented 6 years ago

IMO, the last end-of-line (\n) shouldn't be added.

Cf. the PDFBox issue you referenced: Even the PDF/A people consider that EOL valid (but not required), and the PDF/A requirements can be quite a PITA in other respects.

pvandenbroucke commented 6 years ago

@andreasrosdal I'll do some tests. If I have the problem, I'll try to propose a PR.

Do you know which PDF standards are supported by OpenPDF ?

FYI, your current merge should also help on this issue.

@mkl-public Do you see any limitation with my above logic (if we strictly refer to the ISO 32000) ? In DSS, we don't officially support PDF/A. We only offer a minimal support (warnings on alpha layer with visible signatures,...).

pvandenbroucke commented 6 years ago

Hello,

I'm facing another issue, similar to the one above, when I generate the data to be signed twice with the same parameters.

The MODDATE value can be different. Here is the related line. A new instance of PdfDate is created on each call. If more than one second passes between the calls, the MODDATE values won't be equal and the signature breaks.

I see two solutions :

What do you prefer ? Could you have a look ?

Thanks in advance,

Pierrick

andreasrosdal commented 6 years ago

retrieve the signingTime

How do you suggest we get the signingTime?

What do you prefer ? Could you have a look ?

I don't have a strong opinion for either alternative. If you can force it easily with code in dss-pades-openpdf, then that is fine with me. Is that okay for you?

pvandenbroucke commented 6 years ago

That's ok for me. I will specify the date in the Info dictionary.

If that's not specified, you can create a new instance like now.
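The agreed behaviour (use the caller's date when one is given, otherwise fall back to the current time) can be sketched with java.time. This is not the OpenPDF PdfDate class; the names are hypothetical, and the output follows the PDF date syntax D:YYYYMMDDHHmmSS followed by a UTC offset such as +02'00' or Z.

```java
import java.time.Clock;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

class ModDateSketch {

    private static final DateTimeFormatter BASE =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss", Locale.ROOT);

    // Formats a timestamp as a PDF date string with second precision.
    static String toPdfDate(ZonedDateTime time) {
        int offsetSeconds = time.getOffset().getTotalSeconds();
        if (offsetSeconds == 0) {
            return "D:" + BASE.format(time) + "Z";
        }
        char sign = offsetSeconds < 0 ? '-' : '+';
        int abs = Math.abs(offsetSeconds);
        return String.format(Locale.ROOT, "D:%s%c%02d'%02d'",
                BASE.format(time), sign, abs / 3600, (abs % 3600) / 60);
    }

    // Uses the enforced date when present, otherwise the current time from the clock.
    static String modDate(ZonedDateTime enforced, Clock clock) {
        ZonedDateTime time = enforced != null ? enforced : ZonedDateTime.now(clock);
        return toPdfDate(time);
    }
}
```

With an enforced date, two runs that are seconds apart still produce the same string, so the digest of the data to be signed stays stable.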

andreasrosdal commented 6 years ago

So we change this, so that if a PdfName.MODDATE has already been set in the Info dictionary, then don't set a new one? Or are no changes required to OpenPDF as a result of this?

pvandenbroucke commented 6 years ago

This code :

PdfStamper stp = PdfStamper.createSignature(reader, output, '\0', null, true);
PdfWriter writer = stp.getWriter();
PdfDictionary info = writer.getInfo();
info.put(PdfName.MODDATE, new PdfDate(cal));

has currently no effect.

That's not a good idea. In case of multiple layers (more than one signature), the oldInfo will already have a MODDATE and it will be copied in the newInfo dictionary.

Do you have any idea ?

Edit : In this commit, I have a working solution. What do you think about it ? It may be interesting to have a specific new interface to handle external signing ?
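The objection about multiple layers can be illustrated with plain maps standing in for the Info dictionary. This sketch is not OpenPDF code and all names are hypothetical. It compares the rejected idea (keep MODDATE if one is already present) with an explicitly enforced date.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class InfoMergeSketch {

    // Rejected idea: only set ModDate when the copied dictionary has none.
    static Map<String, String> keepExisting(Map<String, String> oldInfo, String now) {
        Map<String, String> newInfo = new LinkedHashMap<>(oldInfo);
        newInfo.putIfAbsent("ModDate", now);
        return newInfo;
    }

    // Alternative: an explicitly enforced date wins, otherwise the current time is used.
    static Map<String, String> enforceOrNow(Map<String, String> oldInfo, String enforced, String now) {
        Map<String, String> newInfo = new LinkedHashMap<>(oldInfo);
        newInfo.put("ModDate", enforced != null ? enforced : now);
        return newInfo;
    }

    public static void main(String[] args) {
        Map<String, String> afterFirstSignature = new LinkedHashMap<>();
        afterFirstSignature.put("ModDate", "D:20180101000000Z");

        // Second signature: the stale date from the first layer survives.
        System.out.println(keepExisting(afterFirstSignature, "D:20180822141300Z"));
        // Second signature with an enforced date: the new layer gets the intended value.
        System.out.println(enforceOrNow(afterFirstSignature, "D:20180822141300Z", "D:20180822141305Z"));
    }
}
```

For a second signature, the first strategy silently keeps the date of the first layer, while the second one stays deterministic and reflects the new signing time.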

mkl-public commented 6 years ago

@pvandenbroucke

Do you see any limitation with my above logic (if we strictly refer to the ISO 32000) ? In DSS, we don't officially support PDF/A. We only offer a minimal support (warnings on alpha layer with visible signatures,...).

I merely mentioned the PDF/A people because they usually are quite some nitpickers. If they accept the %%EOF both with and without a following end-of-line, this can be taken as a sign that neither variant can conclusively be proven to be wrong.

Furthermore, I am not aware of that end-of-line causing any issues.

All in all, therefore, I don't see a reason to change the current implementation.


In this commit, I have a working solution. What do you think about it ?

That should work. But I would propose a different name for those methods, e.g. getEnforcedModificationDate and setEnforcedModificationDate. Otherwise a user who does not know why those methods came into being would think he could retrieve the current value of the modification metadata with getModificationDate.

andreasrosdal commented 6 years ago

That should work. But I would propose a different name for those methods, e.g. getEnforcedModificationDate and setEnforcedModificationDate. Otherwise a user who does not know why those methods came into being would think he could retrieve the current value of the modification metadata with getModificationDate.

Renamed to getEnforcedModificationDate and setEnforcedModificationDate in this commit: https://github.com/LibrePDF/OpenPDF/commit/b1132b3afdec8cb9bbe2dc79d9d546c90ab40765

Does this mean that DSS+OpenPDF can be used to sign PDF files with PAdES very soon now?

pvandenbroucke commented 6 years ago

Yes, it will be possible in the next release (version 5.4). I still need to do some more refactorings, the visible signature and integrate unit tests.

DSS is now able to produce PAdES Baseline Profiles B/T/LT/LTA with OpenPDF.

Thanks for your help

andreasrosdal commented 6 years ago

OpenPDF version 1.2.1 has been released, and includes these changes: https://github.com/LibrePDF/OpenPDF/releases/tag/openpdf-1.2.1 @pvandenbroucke I hope you'll be able to use this version in the openpdf-integration branch.

pvandenbroucke commented 6 years ago

Hello,

I continued the integration with the unit tests and the visible signature.

I am having some difficulty reproducing the same visual signatures with OpenPDF. Depending on the parameters, the class PdfSignatureAppearance may generate additional things (like text from the certificates,...). Do you have any sample which creates visual representations ?

Another, more important issue is that visual generation doesn't guarantee the order of the objects in the PDF file. I didn't find the root cause.

In attachments, you can find examples of generation with the same parameters : bc79ba9f-270d-4cc5-b01c-3a12b579f0b6.pdf e0de83e0-1219-497c-894b-18936aa622a5.pdf

Regards,

Pierrick

andreasrosdal commented 6 years ago

These are the examples we have: https://github.com/LibrePDF/OpenPDF/tree/master/pdf-toolbox/src/test/java/com/lowagie/examples

mkl-public commented 6 years ago

@pvandenbroucke: Depending on the parameters, the class PdfSignatureAppearance may generate additional things (like text from the certificates,...). Do you have any sample which creates visual representations ?

If you want iText to layout the information in the signature appearance, you can supply

If you want to layout it yourself, you can retrieve a PdfTemplate using getLayer(2) and fill it as you want.

Lonzak commented 6 years ago

We do it like this, we only want to show an image (of a visual handwritten signature):

byte[] signatureImage; // some image
PdfSignatureAppearance sigApp = stamper.getSignatureAppearance();
...
sigApp.setCrypto(this.privateKey, this.chain, null, PdfSignatureAppearance.SELF_SIGNED);

sigApp.setAcro6Layers(true);
sigApp.setLayer4Text("");
sigApp.setLayer2Text("");

//No question mark should appear -> overwrite layer 1
PdfTemplate t = sigApp.getLayer(1);
t.setBoundingBox(new Rectangle(100, 100));
t.setLiteral("% DSBlank\n");
...                 
sigApp.setImage(Image.getInstance(signatureImage));
// =0 : image will fully fill the rectangle
// <0 : image will fill the rectangle but will keep the proportions
// >0 : that scaling will be applied. 
// In any of the cases the image will always be centered. It's zero by default. 
sigApp.setImageScale(-1f);

pvandenbroucke commented 6 years ago

Thanks for the replies.

@andreasrosdal : The examples don't use the PdfSignatureAppearance. Do you want me to add some unit tests to ensure that the previous modifications work ? FYI, I'm also able to reproduce the last issue with minimal code. It only occurs when I call setVisibleSignature().

@mkl-public : Thanks, I will try to do by myself with the method getLayer(2).

@Lonzak : I'd like to propose several cases : text only, image only, combination of both (text on left/right,...). Thanks for the code and tips about scale
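The scale rules from Lonzak's code comments (0 fills the rectangle, a negative value fits while keeping the proportions, a positive value is applied as is, and the image is always centered) can be written down as a small calculation. This is a sketch of the arithmetic only, with hypothetical names, not the OpenPDF implementation.

```java
class ImageScaleSketch {

    // Returns {width, height, x, y} of the drawn image inside a rectW x rectH rectangle.
    static float[] place(float rectW, float rectH, float imgW, float imgH, float imageScale) {
        if (imageScale == 0f) {
            // Stretch to the full rectangle; proportions are not preserved.
            return new float[] {rectW, rectH, 0f, 0f};
        }
        float scale = imageScale;
        if (imageScale < 0f) {
            // Largest uniform scale at which the whole image still fits.
            scale = Math.min(rectW / imgW, rectH / imgH);
        }
        float w = imgW * scale;
        float h = imgH * scale;
        // Center the scaled image in the rectangle.
        return new float[] {w, h, (rectW - w) / 2f, (rectH - h) / 2f};
    }
}
```

For a 50 x 50 image in a 200 x 100 rectangle, a scale of -1 yields a 100 x 100 image placed at x = 50, y = 0, while a scale of 1 keeps the image at 50 x 50 and places it at x = 75, y = 25.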

Lonzak commented 6 years ago

Sounds good to me!

andreasrosdal commented 6 years ago

Depending on the parameters, the class PdfSignatureAppearance may generate additional things (like text from the certificates,...)

@pvandenbroucke Do you have more details about this?

pvandenbroucke commented 6 years ago

@andreasrosdal Thanks for fixing the objects ordering.

The class PdfSignatureAppearance contains some methods which do a lot of things (like getAppearance()) and reduce the final possibilities.

If you use DSS with PDFBox, visible signatures can be composed by :

It also handles the rotation, the ratio between PDF/image resolutions and the horizontal/vertical alignment of the text/image.

If the code can be improved in OpenPDF, that would be welcome. Unfortunately, I don't yet have enough knowledge of this framework and I sometimes have difficulty understanding how the different elements interact with each other.

The class PdfSignatureAppearance, if I understood well, could be used with two more or less distinct ways :

In DSS, I work with the second way (the CMS/CAdES is handled in a separate module) and sometimes the crypto elements are required to build the signature appearance.

I think this class could be split into several parts where you plug in the strategy to follow and the required parameters. I can imagine that represents a lot of work.

What do you think about it ?

andreasrosdal commented 6 years ago

What do you think about it ?

I think the current functionality is good enough for a first version. It should be possible to sign the PDF files now, right? I can create another OpenPDF release, and then we can get feedback from the users (release early, release often, as is common in open source).

Are there any critical blockers in OpenPDF now?

mkl-public commented 6 years ago

@pvandenbroucke: I think this class could be split into several parts where you plug in the strategy to follow and the required parameters. I can imagine that represents a lot of work.

As one of the original objectives of OpenPDF was to serve as a maintained drop-in replacement for iText 2.1.7, I doubt existing architectures are going to be torn asunder.

If you look at how the original iText signature API developed (in particular in the 5.3.x versions), you'll see that they in particular introduced static helper methods of a MakeSignature class which model different signing use cases.

A similar set of methods (or even helper classes) could be introduced in OpenPDF to model the different use cases you see.

andreasrosdal commented 6 years ago

OpenPDF 1.2.2 has been released: https://github.com/LibrePDF/OpenPDF/releases/tag/1.2.2

If you look at how the original iText signature API developed (in particular in the 5.3.x versions), you'll see that they in particular introduced static helper methods of a MakeSignature class which model different signing use cases. A similar set of methods (or even helper classes) could be introduced in OpenPDF to model the different use cases you see.

Pull requests for this are welcome! Otherwise, the current signing functionality will have to be sufficient.

srbala commented 5 years ago

The project https://github.com/nuxeo/nuxeo-signature is already using OpenPDF, and it's LGPL. Is there anything from that project that can be used here?

andreasrosdal commented 5 years ago

Search for OpenPDF on the DSS ec.europa.eu page for more info: https://ec.europa.eu/cefdigital/DSS/webapp-demo/doc/dss-documentation.html

JohnPlanetary commented 3 years ago

Please consider integrating the ETSI EN 319 142-1 V1.1.0 (2016-02) (PAdES digital signatures) ( https://www.etsi.org/deliver/etsi_en/319100_319199/31914201/01.01.00_30/en_31914201v010100v.pdf ) B-LTA level.

6.1 Signature levels, item d): B-LTA level provides requirements for the incorporation of electronic time-stamps that allow validation of the signature long time after its generation. This level aims to tackle the long term availability and integrity of the validation material.

Not sure, but I think this can be done by integrating the following code into OpenPDF: https://github.com/esig/dss/tree/master/dss-pades-openpdf I hope it is possible to bring OpenPDF up to date and provide this great and useful functionality.