Provide default sign method, add external signer and webcrypto example.

vbuch / node-signpdf

Simple signing of PDFs in node.

MIT License

711 stars 178 forks source link

Provide default sign method, add external signer and webcrypto example. #220

Closed dcbr closed 7 months ago

dcbr commented 9 months ago

This PR adds a new ExternalSigner abstract class that can be used to simplify the signing of pdfs using an external service (e.g. azure keyvaults or smartcards). An example implementation is also provided for signing with the WebCrypto API and it was also tested with Belgian eID (identity card; code for this not included in this PR).

Details

The abstract Signer class is modified and now provides a default implementation for the sign function. It uses the PKI.js library to construct the CMS signed data structure expected by a PKKLite pkcs7.detached signature. The benefit of this library over node-forge (previously used for this purpose in P12Signer) is that it supports asynchronous signing, which is necessary when communicating with external services. It also leverages the WebCrypto API under the hood. Subclasses now only have to implement at least the getCertificate() (to retrieve the signing certificate) and getKey() (to retrieve the private key used for signing) methods. Furthermore, the used signing or hashing algorithms can be overridden and the used crypto object can be modified as well.
A new ExternalSigner abstract class is added to the @signpdf/utils package, which supports generating signatures using an external signature provider. To use this class, users need to subclass it and implement at least the getCertificate() (to retrieve the signing certificate from the external service) and getSignature(hash, data) (to retrieve the signature of the given hash from the external service) methods. Similar to the Signer base class, the used signing or hashing algorithms can be overridden and the used crypto object can be modified as well.
New webcrypto.js and webcrypto-external.js example scripts are added that show an example implementation of the Signer and ExternalSigner classes respectively. Both use the WebCrypto API, which supports signing with RSA and ECDSA.

vbuch commented 9 months ago

Just reading through the description for now. I think @dhensby's opinion will be of much value here. The way I imagined this was with the Abstract implemention staying in utils and then having signers like signer-webcrypto. I'm not saying it's better this way or the other. Just sharing what my thought was before diving into actually implementing anything. I'll find the time to have a deeper look here soon.

dcbr commented 9 months ago

Alright thanks for your initial thoughts already! It might make more sense to keep the abstract classes in the utility package indeed. Maybe the signer-external from this PR could replace the base Signer class altogether then, as I'm fairly sure the current implementation of P12Signer can be altered to 'fit' as a subclass of ExternalSigner (but I haven't tried or tested this).

dcbr commented 9 months ago

So indeed the existing P12Signer can be implemented as a subclass of the proposed ExternalSigner class (after some minor changes). My suggestion would thus be to make the modified ExternalSigner class the new abstract Signer class. Subclasses would then only have to implement a getCertificate and getSignature method, while the creation of the CMS signed data structure is dealt with in the super class. Would this be ok for you or are there other existing Signer subclasses I should be aware of? Any other suggestions or comments are of course still welcome, before I make these changes (will convert this to a draft for now).

dcbr commented 9 months ago

I restructured the commits a bit and made some small final changes, this is ready for review now.

dcbr commented 8 months ago

Ok, thanks for your comments and further clarifications. I definitely understand your concern for adding these extra dependencies and if that's the policy of this repository, the code should be restructured accordingly.

For what it's worth I will share my personal opinion here (from a user's perspective), but of course I haven't as much experience with it as I only discovered this library a few months ago. I think as a user, I shouldn't really be bothered by how the signature is generated and put into the PDF (as a CMS signed data structure) and which dependencies are necessary for that. If I want to use a P12 certificate, it would be nice to just install the @signpdf/signer-p12 package (with all necessary dependencies, such as node-forge) and use it as shown in the examples (passing the certificate buffer). Similarly, if I have access to an external signature provider (such as azure keyvaults), it would be nice to just install/use the SignerExternal class proposed here (with all necessary dependencies, such as pkijs) and use it as shown in the examples (i.e. subclassing the abstract SignerExternal class and linking it with the external signature provider by providing the getCertificate and getSignature methods).

Of course we can create separate signers for each service, i.e. a signer-azure (for azure keyvaults), signer-webeid (library I used to create signatures with EU identity cards) or signer-webcrypto (as in the examples). But ultimately they will all share most of their code (the creation of the CMS signed data structure) with only minor differences (the way the certificate is retrieved and the signature of the hashed signed attributes is obtained), so that's why I thought it would be nice to have a generic Signer (or ExternalSigner) class which already deals with that common logic. When such a generic Signer implementation is provided, it's also much easier to later on extend it to support e.g. PAdES B-T or B-LTA compliant signatures for all subclassed signers (which would otherwise have to individually support this). Maybe to avoid the extra dependencies from entering the utils package, a separate @signpdf/signer (or signer-base) package could be created, which implements this common logic, while keeping the Signer class in the utils package as minimal as possible (i.e. as it was before this PR)?

dhensby commented 8 months ago

Hey, sorry for the delay in looking at this.

Firstly, great work and effort, @dcbr! Looks like you've put a good deal of time and thought into this, so thanks for that.

I do share @vbuch's concerns about dependencies and the potential onward bloat that may be caused by this... However, I do wonder if there also needs to be a slight mental shift in what the library takes ownership of.

At the moment the library is essentially, sign with forge and a P12 cert or... do it all yourself. A lot of the code here is similar to what I'm doing in my azure signer that I run in production, the entire CMS object needs to be constructed in the application and then signed in the KeyVault... That adds a lot of complication because there needs to be an understanding of how to construct the CMS object, encode it and so on. At the moment this library doesn't really need to concern itself with that because it essentially hands that responsibility off to forge to do (though the forge implementation is not perfect, and forge is seeing less and less maintenance effort).

As this addition shows, if you want to write any other kind of signer, you have to implement all the low-level logic for signature creation, not just signing. In that sense, the Signer class is a slight misnomer because it's doing a lot more than just the cryptographic step of signing data.

I do like this approach because it allows us to take control of the process and really provide the "plug-ability" of the consumer to just provide the cryptographic engine (ie: webcrypto, an external KMS/HSM, etc). But it does mean that there is quite a large increase in scope and responsibility of the library, especially to make it extensible in a way that is actually useful. At the moment, this PR has a fairly rigid CMS structure in place that doesn't really have any extensibility; for example, I'm creating LTV CMS signatures in my project and that requires embedding the OCSP response in the CRL info to the CMS object - that's not possible the way this PR is written at the moment... That of course opens the question of whether we even want it to be that flexible.

My assumption on it when I first looked at making the signing step abstract (so being able to replace the forge signing with my own) was that if you don't want forge, then you go it alone - this lib isn't here to do the CMS construction. Is that a valid assumption or not? Are the libs that are more flexible that can just be deferred to for this? Is what is in this PR something we expect consumers to do for themselves if they need that level of flexibility?

I'm not going to answer those questions, that's more for @vbuch to decide on, I think... But what I would say is that whilst I like this approach, it wouldn't actually do anything for my personal use case but it would increase the maintenance burden of the package.

dcbr commented 8 months ago

Thanks for your comments @dhensby, I entirely agree with your point of view.

At the moment, this PR has a fairly rigid CMS structure in place that doesn't really have any extensibility; for example, I'm creating LTV CMS signatures in my project and that requires embedding the OCSP response in the CRL info to the CMS object - that's not possible the way this PR is written at the moment... That of course opens the question of whether we even want it to be that flexible.

Regarding this point, I feel like this PR is a first step towards a more flexible implementation of the CMS generation. For example, I have been able to create PAdES B-B and B-T compliant signatures with it, with minimal extra code additions (just adding the right signed or unsigned attributes). I think it would be nice if this library supported different such signature types and your implementation of LTV CMS signatures would be a nice addition then as well (haven't delved into that myself yet, but it also comes down to adding extra (un)signed attributes IIRC?). Of course it increases the scope of this package, but I think there's quite some demand for these things by looking at the various issues I encountered while working on this (external: #15, #40, #46, #142, #175; PAdES: #68, #71, #124, #149, #183).

dcbr commented 8 months ago

I moved the base Signer and ExternalSigner implementations out of the utils package into a new signer package, to keep the extra dependencies out. The CMS generation is now also a bit more flexible, allowing subclasses to provide their own signed and unsigned attributes, depending on their needs.

dhensby commented 8 months ago

The main thing that determines if we take this forward is if @vbuch is happy to increase the scope of the package in this way or if it's better to continue to rely on external implementations (like node-forge) to produce the signatures.

vbuch commented 8 months ago

Sorry for being a bit slow here.

It's not about me being happy with increasing the scope. If the scope is large more people will need to take care of it :)

I think from the point of view of PDF signing, all you care about is having an async sign() method. No further abstactions needed. On the other hand I see the value in ExternalSigner that gives you the structure but allows you to work further with specifics. With that said does it make sense to the old abstract Signer that only has sign() with zero implementation and then have another abstract PkiSigner that does the rest?

Just thinking here.

The idea was always to just show how it can be done. The more readable (least abstract) the code - the better. I'm not sure we've kept it readable enough through the years to be honest. But that was the aim.

pki has moved into /signer as a dep, but /signer is a dep of signer-p12 so effectively p12 still requires pki.

I don't know really. Your words all make sense and I may be protecting some values that I shouldn't...

dcbr commented 8 months ago

It's not about me being happy with increasing the scope. If the scope is large more people will need to take care of it :)

If this is worrying you, I can obviously help to maintain this part of the project if you want. No hard guarantees on responsiveness (I have my slower and faster periods), but I would be glad to help in resolving issues, reviewing PRs or further contributing to these abstract base classes.

With that said does it make sense to the old abstract Signer that only has sign() with zero implementation and then have another abstract PkiSigner that does the rest?

I think it does make sense, taking into account the extra dependencies the base implementation brings with it. If people want to avoid these dependencies, they can just build the CMS structure themselves (using their own code or alternative dependencies) by subclassing the "zero implementation"-Signer class in the utils package (maybe this class can be renamed to ISigner to make it clear there's no implementation at all, i.e. a barebone interface). Otherwise, if they don't want to be bothered by these implementational details (and just want to focus on the cryptographic signing step), they can just go ahead and subclass the "base implementation"-Signer class in the signer package (which of course then requires the extra dependencies as well).

The idea was always to just show how it can be done. The more readable (least abstract) the code - the better. I'm not sure we've kept it readable enough through the years to be honest. But that was the aim.

That's a very honorable aim and I hope I can rework this PR together with you to make sure we achieve that. I think the signing process can still remain readable, even with an extra abstract Signer class implementation. But it will be key to document everything well and provide sufficient examples. I would like to add an example using Azure Keyvaults and maybe even the web-eid implementation I have been using, but as this PR is already quite large I think it's best to keep that for follow-up PRs and first focus on finishing this initial proposal.

pki has moved into /signer as a dep, but /signer is a dep of signer-p12 so effectively p12 still requires pki.

Indeed, but it is now a peer dependency (just like node-forge). This is optional though, if you want I can revert this part and just keep signer-p12 as it was. I thought this would be useful though if we extend the capabilities of the base Signer implementation, e.g. to support signing with timestamps (PAdES-B-T) or with long term validation support (LTV CMS structures @dhensby was talking about).

vbuch commented 8 months ago

If this is worrying you, I can obviously help to maintain this part of the project if you want. No hard guarantees...

So that's settled.

I think it does make sense, taking into account the extra dependencies the base implementation brings with it. If people want to avoid these dependencies, they can just build the CMS structure themselves (using their own code or alternative dependencies) by subclassing the "zero implementation"-Signer class in the utils package (maybe this class can be renamed to ISigner to make it clear there's no implementation at all, i.e. a barebone interface). Otherwise, if they don't want to be bothered by these implementational details (and just want to focus on the cryptographic signing step), they can just go ahead and subclass the "base implementation"-Signer class in the signer package (which of course then requires the extra dependencies as well).

Sounds right.

Indeed, but it is now a peer dependency...

Yes. IMO keep the changes minimal and then, when there is need and time, refactors can happen.

dcbr commented 8 months ago

Alright, I'm wrapping up some other stuff and then I'll come back to this and do the necessary refactorings

dcbr commented 8 months ago

Alright, the conflicts are resolved and the P12Signer changes have been reverted. Ready for review.

vbuch commented 7 months ago

@dcbr sorry but I had to revert this in develop. Will try to fix it. See #230.