vbuch / node-signpdf

Simple signing of PDFs in node.
MIT License
676 stars 174 forks source link

Another attempt at "Provide default sign method, add external signer and webcrypto example" #230

Open vbuch opened 4 months ago

vbuch commented 4 months ago

The external dependencies are not mocked in the tests which may cause all kinds of problems on different platforms. E.g. tests pass on my Mac but fail on my Windows and in Github Actions. To make sure develop is still workable on I've reverted #220 and readded its commits on top of the reverted version.

We need mocks so that these commits can be merged in.

coveralls commented 4 months ago

Coverage Status

coverage: 99.638% (+0.002%) from 99.636% when pulling c11b6475754de9ce1191632f23e7f2dedb3dcaee on signer-external into 13b071e7640f15c69f5aee9a3c4af417d15cccba on develop.

dcbr commented 4 months ago

I don't really know what's going wrong, so I'll leave it up to you or do you need my help with this?

vbuch commented 4 months ago

I'll give it a go when I have the time.

dcbr commented 3 months ago

I cannot modify this PR branch, so I cloned it to my own repository and rebased onto the latest development branch. You can see the result here. All tests pass (after adding an extra one to improve coverage), and apparently there are also no issues on the windows build? @vbuch: Could you verify that this works correctly on your Windows environment and then go ahead with this PR?

TimKieu commented 3 months ago

@vbuch Let me bother you on run-time compatible of this signpdf: can this tool-kit run on edge run-time? Vercel or CloudFlare. Skimming on some example, this is nodejs run-time. Thanks much indeed.

dhensby commented 3 months ago

This may be better discussed in a dedicated issue, but I don't really like this approach as it's mixing the responsibilities of signature generation (the actual signing), with the CMS structure generation.

At the moment the signing interface is really simple and allows for any external signer to be used to generate signatures - all you need to do is effectively provide a single function that signs the provided data. see https://github.com/vbuch/node-signpdf/blob/v3.2.4/packages/utils/src/Signer.js

What this change is really doing is taking the CMS structure generation in-house (which is all good) but it's mixed-in with the signing logic. IMO we need to separate these concerns and provide a way to generate the CMS structure, and then a way to do the signing.

We can think of it a bit like a pipeline. First the PDF needs to be created (or loaded), then a CMS structure is created, then the CMS structure is signed, then the CMS is embedded in the PDF. The signing part of this pipeline shouldn't have to be concerned with anything other than a buffer/uint8array that it receives and then returns a buffer/uint8array that represents the signature.

dcbr commented 3 months ago

Regarding your last point, I'm not sure I fully understand what you mean.

I don't really like this approach as it's mixing the responsibilities of signature generation (the actual signing), with the CMS structure generation.

At the moment the signing interface is really simple and allows for any external signer to be used to generate signatures - all you need to do is effectively provide a single function that signs the provided data. see https://github.com/vbuch/node-signpdf/blob/v3.2.4/packages/utils/src/Signer.js

What this change is really doing is taking the CMS structure generation in-house (which is all good) but it's mixed-in with the signing logic. IMO we need to separate these concerns and provide a way to generate the CMS structure, and then a way to do the signing.

I would argue that the current situation is even worse/more coupled and that the implementation here is a first step towards decoupling it. For example, when subclassing the current Signer, you indeed have to generate the CMS structure and sign it yourself (or let this be done by another service/library). In this implementation, when subclassing Signer or ExternalSigner you should only take care of the cryptographic signing step and the CMS structure is automatically generated for you already.

We can think of it a bit like a pipeline. First the PDF needs to be created (or loaded), then a CMS structure is created, then the CMS structure is signed, then the CMS is embedded in the PDF. The signing part of this pipeline shouldn't have to be concerned with anything other than a buffer/uint8array that it receives and then returns a buffer/uint8array that represents the signature.

IMO that is exactly how the signing part is implemented in the new Signer and ExternalSigner classes (either by passing a uint8array signing key in the getKey method, or by signing a received uint8array in the getSignature method). Further separating it to obtain this pipeline procedure, would be even better of course. But I think the current PR is already an improvement and first step in that direction?

dhensby commented 3 months ago

I agree that this is an improvement on what we've currently got (which, as you rightly point out, is either a very rigid implementation from node-forge or you have to complete BYO CMS). My current implementation is the BYO CMS approach, I build and sign the entire signature and have that embedded as the node-forge implementation is not "fit for purpose", so I'm totally on-side with improving this.

It's really the architectural approach I don't agree with here. If we have "private" methods that we are implementing for the crypto aspect, then I'd argue they should not even be in the class at all.

If we want this "all-in-one" signer class, I feel it should probably work something like:

// Use the JWA spec for algs as a fairly well shared standard
type SigningAlg = 'RS256' | 'RS384' | 'RS512' | 'ES256' | 'ES384' | 'ES512' | 'PS256' | 'PS384' | 'PS512';

// we may want to add `crls` or `ocsp` info here?! or a way to fetch that info??
interface SignerOptions {
  certificate: string|UInt8Array, // the certificate being used for signing
  signer: (alg: SigningAlg; payload: UInt8Array) => Promise<UInt8Array>;
  certificateChain?: (string|UInt8Array)[]; // the certificate chain
  digestAlgorithm?: SigningAlg;
  authenticatedAttributes?: Record<string, string>; // provide a way to override / add authenticated attrs
  unauthenticatedAttributes?: Record<string, string>; // provide a way to override / add unauthenticated attrs
}

class Signer {
  constructor(opts: SignerOptions) {
    // just pseudo code, this would need to be validated, and stored against props properly
    this.opts = opts;
  }

  async sign(pdfBuffer: UInt8Array): Promise<UInt8Array> {
    // construct CMS encoded attrs for signing
    const sig = await this.opts.signer(algsignedAtrrs);
    // finish constructing CMS data
    return cmsData;
  }

This is obviously really simplified, it doesn't provide any way to support a .pfx/P12 certificate. It also makes a really hard assumption that there will only be one signer for the CMS, but there's obviously room in the spec for the CMS sig to have many signers...


After writing this out, I still don't like this approach. I think that we really need our own CMS interface that is in charge of creating the CMS structure, and then separately a Signer interface that can be passed to the CMS generator. The CMS structure isn't trivial, whilst it's easy to create a naive CMS signing interface, once you start drilling down into the actual spec, it really becomes quite complicated and there's a lot to be implemented (which I suspect is why this has been kept outside the remit of the library).

I will try to find some time to put something together...

dcbr commented 3 months ago

Ok, thanks for the further clarifications. This PR was my first attempt and idea of providing some extra functionality that is currently not supported, but I'm definitely open to reconsider the architecture for the better. My intent was to simplify the generation of PAdES compliant signatures, even up to the LTV level (requiring indeed some ocsp or crl data to be embedded in the CMS or pdf DSS). By extending the proposed Signature class of this PR I was able to produce such signatures. It was still in a prototype phase and too much changes to include in this PR (which is already quite large on its own), but it nicely "fit in" this class structure in my opinion. The idea was then that a user could pass a "signature configuration" structure when constructing a Signer instance (with some predefined defaults for e.g. "pades-b-b", "pades-b-t", "pades-b-ltv" type signatures). Unfortunately I haven't had much time since to further complete it (and I won't have in the coming months).

So if in the meantime you can come up with an alternative implementation, I'm all for it and very curious to see the result :)

Edit: Just found the "signature configuration" interface I was using (that can then be passed to the Signer constructor):

interface SignConfiguration {
    // Signature dictionary
    subFilter: string; // Subfilter of the signature dictionary
    // Signed attributes
    signingTimeAttr: boolean; // Flag indicating whether the `signing-time` signed attribute should be included in the signature
    signingCertAttr: boolean; // Flag indicating whether the `signing-certificate-V2` signed attribute should be included in the signature
    revocationInfoAttr: boolean; // Flag indicating whether the `adbe-revocationInfoArchival` signed attribute should be included in the signature
    // Unsigned attributes
    signatureTimestamp: boolean; // Flag indicating whether the signature should be timestamped (using the `signature-time-stamp` unsigned attribute)
    // Other
    dss: boolean; // Flag indicating whether the signature certificates and revocation information should be added to the DSS
    documentTimestamp: boolean; // Flag indicating whether the document should be timestamped
};
vbuch commented 3 months ago

Sorry. I'm really detached from the communication here. Just sharing an opinion I think I've shared before that is pretty shallow and maybe doesn't help: I prefer to have a minimal scope to maintain and maximum extensibility + modularity. So I imagine if you figure out a way to keep all the pkijs stuff in a separate package it would be the most awesome thing. Again, I didn't read the whole discussion you two have here and I don't think I will in the coming couple of months.