vbuch / node-signpdf

Simple signing of PDFs in node.
MIT License
688 stars 175 forks source link

NEW Decouple signing logic #116

Closed dhensby closed 2 years ago

dhensby commented 3 years ago

Currently I have a need to sign PDFs using an third party service (Azure KeyVault). This means the private key is not locally accessible and instead the signing is done asynchronously "off site".

This means I can't provide a p12 certificate with a private key, so I need to provide my own signing logic which is also async.

As such this change introduces two major changes:

  1. Signing is now an async task to allow signers to be async
  2. Signers are now decoupled from the signing preparation process and a signed using a Signer subclass

Low level changes:


I appreciate this change is quite opinionated and happy to take on feedback to make this acceptable:

eg:

  1. Signers are class based, really they could just be a callable - I mainly did this to allow specific instanceof calls on the signer argument
  2. Signing is now async (though this is a requirement for my usage, so can't change this and it be "useful" for me)
coveralls commented 3 years ago

Pull Request Test Coverage Report for Build 349


Totals Coverage Status
Change from base Build 344: 0.0%
Covered Lines: 316
Relevant Lines: 316

💛 - Coveralls
vbuch commented 3 years ago

Nice! I'm perfectly fine with API changes as long as they make sense.

I don't have the time to look at it right now.

What I imagine will be perfect is continuing/restarting the work on the lerna setup so that separate things are separately installable. E.g. npm i node-signpdf node-signpdf-p12 node-signpdf-helper-plain Will give your code an eye and write back. Thank you!

dhensby commented 3 years ago

Cool - as it currently is I'll be using my fork. I'm happy to refactor to work this out so it can be merged or integrated into another package.

Having the actual signing "drivers" as separate plugins would be cool, but whilst there's only 1 supported it might be over-engineering but I'll leave that decision to you

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had activity in the past 90 days. It will be closed if no further activity occurs. Thank you for your contributions.

dhensby commented 3 years ago

planning to get onto this soon

dhensby commented 3 years ago

Hey @vbuch - I'm at a point where I'm able to put some time into this to get it mergable. Would you be able to take a look and let me know if there's anything I can do to help get this in?

I'll author some changes against the lerna branch and open a separate PR so you can have a look at that too. However, what's the timeline for that version being published/stable?

vbuch commented 3 years ago

Wow! Great news!

  1. The lerna start I've made in a branch long ago is probably very outdated and catching up won't be possible. It stays there just so that I have somewhere to look at when I get the time to work on the project next time.
  2. There is no specific timeline. I release whenever we've confirmed it's stable and usable. We have pretty good coverage of test cases, so I believe, if we keep all/ (as much as possible) tests, we will be good to go.
  3. I am really willing to help in such an endeavor but I am affraid I won't have much time. Still, I'll give my best.
  4. I need to catch up/remember where this here was left. I'll try to do this today.
dhensby commented 3 years ago

Hi @vbuch - thanks for the reply.

Yes, I had a look at the lerna branch and got a bit scared off because I wasn't sure how best to implement this work there.

I'm also thinking the implementation of this may be a bit much when really all we need to supply is a callable to do the actual crypto signing and the ability for that callable to be async/promise-based.

It would be nice for the signers to be separate packages so that forge didn't need to be included if it wasn't needed, but that feels like a less essential refactor.

dhensby commented 3 years ago

On another note, I'm currently working on doing PAdES-LTV level signing of documents; there are likely to be some changes I need to make to the library to comply with this. If the work is something I can contribute back to the library I'll open a PR for that.

There will definitely be some minor tweaks needed at the least to make it possible.

vbuch commented 3 years ago

Checked the code here. Seems mergable. But I don't want to release this unless stuff is separated in packages. So, yes, lerna setup becomes mandatory for me. Also checked the lerna branch and possible rebase... Impossible rebase. Can't promise anything, but most def willing to get that done...

rossinicolas commented 3 years ago

@rossinicolas

1. I am not using `plainAddPlaceholder` but I have noticed when using it for a small test that it did not appear to work correctly - this probably should be raised in the upstream library

2. If you're getting the error `pdf.initForm is not a function` then I'd assume you're not using pdfkit v0.11 and you're using v0.10.

yes, i understood that but in our case we need use plainAddPlaceHolder because the pdf file its not made by pdfkit or in some case we need sign a third part pdf file. do you have any idea when @vbuch merge this branch to the official code? Thanks in advance.

masterviana commented 2 years ago

@dhensby I'm working on a case where I've the cerificate but the sign process is done by a third party service that hold my private key. The goal is produce a PAdES compliant PDF file. I'm looking to this repo and try to mod to fix my needs. Do you have wrote some test file that work for your case?

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had activity in the past 90 days. It will be closed if no further activity occurs. Thank you for your contributions.

dhensby commented 1 year ago

@vbuch - I'd like to revist this at some point (enabling external signers). Do you have any guidance on whether you're still hoping to get the lerna setup completed and wanting to pull this kind of thing into separate packages?

vbuch commented 1 year ago

@dhensby wanting - totally. Actually doing it - really hard to find the time.

vbuch commented 10 months ago

@dhensby it is mostly done. https://github.com/vbuch/node-signpdf/tree/feature/split yarn workspaces + Lerna Not saying it's perfect but it passes linting, testing and building of the separate packages.

vbuch commented 10 months ago

@dhensby https://github.com/vbuch/node-signpdf/pull/195 introduces this change. Merging directly wasn't possible because of the new structure of things but the end goal is achieved. Would be very helpful if you could give it a look. Thanks in advance.

dhensby commented 10 months ago

I'll definitely look to find some time to rework this

cw6365 commented 7 months ago

On another note, I'm currently working on doing PAdES-LTV level signing of documents; there are likely to be some changes I need to make to the library to comply with this. If the work is something I can contribute back to the library I'll open a PR for that.

There will definitely be some minor tweaks needed at the least to make it possible.

@dhensby did you make any progress with this? Do you have any sample code? Thanks.

dhensby commented 7 months ago

@cw6365 I did, yes - I ended up using https://github.com/PeculiarVentures/asn1-schema/tree/master/packages/cms and https://github.com/PeculiarVentures/asn1-schema/tree/master/packages/ocsp to manually construct and sign the signatures and I implemented my own signer to achieve this.

Unfortunately it's not really a "one size fits all" implementation, and it's also not perfect because the signatures aren't actually being verified by Adobe Acrobat... I've struggled to truly understand the PDF spec in this regard as the CMS signature does appear to comply with the LTV spec, but I believe PDFs need to be signed twice for the LTV signature to be recognised - and I pretty much gave up at that point 😓

The main aspect is that the OCSP/CRL response needs to be embedded in the signature, and fetching that from the CA is not something I've implemented in a particularly generic way...

cw6365 commented 7 months ago

Thanks for responding @dhensby. From what i have read so far.. do i have the right flow in terms of adding a timestamp and a OCSP response..

If you could let me know if i'm on the write track please.. i'll look at coding it in.

Thanks. CW

sagge commented 7 months ago

I have actually managed to get Adobe Acrobat to recognise and verify the signatures and show "Signature is LTV enabled" with a heavily modified version of this library, forked version of node-forge and some additional libraries like jsrsasign while using AWS KMS to do the actual signature creation (private key residing in HSM in AWS in order to get properly recognised certificate).

Screenshot 2024-01-22 at 17 53 05

I forked all the libraries over two years ago and they have developed quite a bit further, but trying to share some of the findings if others would find it useful. At that point node-signpdf didn't support visual signature widgets and signing with external HSM wasn't in the horizon, so the changes needed to get all this working was not possible to do through PRs given that there was a time limit on my side to get this working. It would be nice at some point to start with the current state of signpdf and see how easily this could be incorporated into the library. In the meantime, I hope that someone can find some inspiration from the changes I had to make.

First of all, Forge needs some modifications to support this properly and the lib/pkcs7.js needs to be modified in particular to enable construction of correct data format. If I recall everything correctly, the following are needed at least for forge:

signatureAlgorithm: forge.pki.oids.rsaEncryption -> signatureAlgorithm: forge.pki.oids.sha256WithRSAEncryption
unauthenticatedAttributes: [] -> unauthenticatedAttributes: signer.unauthenticatedAttributes || [],

Calling the p7.addSigner inside signer.sign() in node-signpdf would look like this after Forge has the support needed:

p7.addSigner({
  key: privateKey,
  certificate: signerCert,
  issuer: signerCert.issuer,
  serialNumber: signerCert.serialNumber,
  digestAlgorithm: forge.pki.oids.sha256,
  authenticatedAttributes: [
    // ORDER MATTERS!
    {
      type: forge.pki.oids.contentType,
      value: forge.pki.oids.data,
    },
    {
      type: forge.pki.oids.messageDigest,
      // value will be auto-populated at signing time
    },
    {
      type: forge.pki.oids.essSigningCertificateV2,
      value: essSigningCertAsn1,
    },
    {
      type: forge.pki.oids.revocationInfoArchival,
      value: revocationInfoArchivalAsn1,
    },
  ],
  unauthenticatedAttributes: [
    {
      type: forge.pki.oids.timeStampToken, // "1.2.840.113549.1.9.16.2.14"
      value: '',
    },
  ],
});

And the needed data would look like this:

const privateKey = new AwsKmsPrivateKey(<Alias_of_your_private_key_in_KMS>);

// Here comes the actual PKCS#7 signing.
const p7 = forge.pkcs7.createSignedData();
// Start off by setting the content.
p7.content = forge.util.createBuffer(pdf.toString('binary'));

p7.addCertificate(intermediateCertPem);
p7.addCertificate(certificatePem);

const jsrsasignEssSigningCert = new jsrsasign.asn1.cms.ESSCertIDv2(certificatePem);
const jsrsasignCertHex = jsrsasignEssSigningCert.getEncodedHex();
const essSigningCertAsn1 = forge.asn1.create(
  forge.asn1.Class.UNIVERSAL,
  forge.asn1.Type.SEQUENCE,
  true,
  [forge.asn1.fromDer(forge.util.hexToBytes(jsrsasignCertHex))]
);

const signerOcspResponseAsn1 = forge.asn1.fromDer(ocspResponseSignerCert.toString('binary'));
const revocationInfoArchivalAsn1 = forge.asn1.create(
  forge.asn1.Class.UNIVERSAL,
  forge.asn1.Type.SEQUENCE,
  true,
  [
    forge.asn1.create(forge.asn1.Class.CONTEXT_SPECIFIC, 1, true, [
      forge.asn1.create(forge.asn1.Class.UNIVERSAL, forge.asn1.Type.SEQUENCE, true, [
        // Signer OCSP
        signerOcspResponseAsn1,
      ]),
    ]),
  ]
);

const signerCert = forge.pki.certificateFromPem(certificatePem);

The OCSP check needs to be performed for the certificate(s) that signed your signing certificate until you reach a certificate that is recognised by either Adobe AATL list or European Union Trust Lists EUTL. In my case I needed to check the intermediate certificate and implemented the checking using a check function found in here: https://github.com/indutny/ocsp/blob/master/lib/ocsp/check.js and modified it to just return the raw response instead of checking it:

...
utils.getResponse(uri, req.data, (error, raw) => {
  if (error) return done(error);
  done(null, raw);
});
...

And then just calling it like this before adding the placeholder:

const ocspPromise = ({ cert, issuer }) => {
  return new Promise((resolve, reject) => {
    check(
      {
        cert,
        issuer,
      },
      (err, res) => {
        if (err) reject(err);
        resolve(res);
      }
    );
  });
};

const ocspResponseSignerCert = await ocspPromise({
  cert: certificatePem,
  issuer: intermediateCertPem,
});

The DSS needs to be added before signing the document and in my case it was enough to include the signing cert, intermediate cert and OCSP responses for the intermediate cert. I used the plainAddPlaceholder function for this and extended it to take care of adding the DSS to the appropriate location. My plainAddPlaceholder looks like this:

const plainAddPlaceholder = async ({
  pdfBuffer,
  reason,
  contactInfo,
  name,
  signatureLength,
  subFilter,
  signerIdx,
  signerPerson,
  signatureData,
  signingTime,
  certificatePem,
  intermediateCertPem,
  ocspResponseSignerCert,
}: AddPlaceholderOptions) => {
  const pdfKitMock = new PdfKitMock(pdfBuffer, signerPerson.pageNb);

  const { form, widget } = await pdfkitAddPlaceholder({
    pdf: pdfKitMock,
    pdfBuffer,
    reason,
    contactInfo,
    name,
    signatureLength,
    subFilter,
    signerIdx,
    signerPerson,
    signatureData,
    signingTime,
  });

  const signingCert = forge.pki.certificateFromPem(certificatePem);
  const intermediateCert = forge.pki.certificateFromPem(intermediateCertPem);

  const signingCertStream = Buffer.from(
    forge.asn1.toDer(forge.pki.certificateToAsn1(signingCert)).getBytes(),
    'binary'
  );
  const intermediateCertStream = Buffer.from(
    forge.asn1.toDer(forge.pki.certificateToAsn1(intermediateCert)).getBytes(),
    'binary'
  );
  const ocspAiaStream = Buffer.from(ocspResponseSignerCert, 'binary');

  const dssRefIndex = pdfKitMock.info.xref.maxIndex + 1;
  const signingCertRefIndex = pdfKitMock.info.xref.maxIndex + 2;
  const intermediateCertRefIndex = pdfKitMock.info.xref.maxIndex + 3;
  const ocspAiaRefIndex = pdfKitMock.info.xref.maxIndex + 4;

  let lastDssDictString;

  if (!isContainBufferRootWithAcroform(pdfKitMock.pdf)) {
    const rootIndex = getIndexFromRef(pdfKitMock.info.xref, pdfKitMock.info.rootRef);
    pdfKitMock.addedReferences.set(rootIndex, pdfKitMock.pdf.length + 1);
    pdfKitMock.pdf = Buffer.concat([
      pdfKitMock.pdf,
      Buffer.from('\n'),
      createBufferRootWithAcroform(pdfKitMock.info, form, `${dssRefIndex} 0 R`),
    ]);
  } else {
    const lastDssDictRef = getLastDssDictRef(pdfKitMock.info);
    lastDssDictString = findObject(pdfKitMock.pdf, pdfKitMock.info.xref, lastDssDictRef).toString();
    pdfKitMock.pdf = Buffer.concat([
      pdfKitMock.pdf,
      Buffer.from('\n'),
      addReferenceToRootWithAcroform(pdfKitMock.info, `${dssRefIndex} 0 R`),
    ]);
  }

  if (lastDssDictString) {
    const previousCerts = extractCertReferences(lastDssDictString);
    const previousOcsps = extractOCSPsReferences(lastDssDictString);
    pdfKitMock.ref(
      {
        // Type: 'DSS',
        Certs: [
          ...previousCerts,
          new PDFKitReferenceMock(signingCertRefIndex),
          new PDFKitReferenceMock(intermediateCertRefIndex),
        ],
        OCSPs: [...previousOcsps, new PDFKitReferenceMock(ocspAiaRefIndex)],
      },
      dssRefIndex
    );
  } else {
    pdfKitMock.ref(
      {
        // Type: 'DSS',
        Certs: [
          new PDFKitReferenceMock(signingCertRefIndex),
          new PDFKitReferenceMock(intermediateCertRefIndex),
        ],
        OCSPs: [new PDFKitReferenceMock(ocspAiaRefIndex)],
      },
      dssRefIndex
    );
  }

  pdfKitMock.ref(
    {
      Length: signingCertStream.length,
      stream: signingCertStream,
    },
    signingCertRefIndex
  );

  pdfKitMock.ref(
    {
      Length: intermediateCertStream.length,
      stream: intermediateCertStream,
    },
    intermediateCertRefIndex
  );

  pdfKitMock.ref(
    {
      Length: ocspAiaStream.length,
      stream: ocspAiaStream,
    },
    ocspAiaRefIndex
  );

  // For the future: PAdES-LTV requires this, not sure if this is the only thig needed. Space for the content has to be
  // reserved before signing and poupulated during the signing process similar to signature content itself.
  // const docTimeStamp = pdfKitMock.ref({
  //   Type: 'DocTimeStamp',
  //   SubFilter: 'ETSI.RFC3161',
  // };

  pdfKitMock.addedReferences.set(pdfKitMock.pageIndex, pdfKitMock.pdf.length + 1);
  pdfKitMock.pdf = Buffer.concat([
    pdfKitMock.pdf,
    Buffer.from('\n'),
    createBufferPageWithAnnotation(pdfKitMock.pdf, pdfKitMock.info, pdfKitMock.pageRef, widget),
  ]);

  pdfKitMock.pdf = Buffer.concat([
    pdfKitMock.pdf,
    Buffer.from('\n'),
    createBufferTrailer(pdfKitMock.pdf, pdfKitMock.info, pdfKitMock.addedReferences),
  ]);

  return pdfKitMock.pdf;
};

With the helper functions looking like this:

const isContainBufferRootWithAcroform = (pdf: Buffer) => {
  const bufferRootWithAcroformRefRegex = /\/AcroForm\s+(\d+\s\d+\sR)/g;
  const match = bufferRootWithAcroformRefRegex.exec(pdf.toString());

  return match != null && match[1] != null && match[1] !== '';
};

export const createBufferRootWithAcroform = (
  info: Info,
  form: string | PDFKitReferenceMock,
  dss: string
) => {
  const rootIndex = getIndexFromRef(info.xref, info.rootRef);

  return Buffer.concat([
    Buffer.from(`${rootIndex} 0 obj\n`),
    Buffer.from('<<'),
    Buffer.from(`${info.root}`),
    Buffer.from(`/AcroForm ${form}\n`),
    Buffer.from(`/DSS ${dss}`),
    Buffer.from('\n>>\nendobj\n'),
  ]);
};

export const addReferenceToRootWithAcroform = (info: Info, dss: string) => {
  const rootIndex = getIndexFromRef(info.xref, info.rootRef);
  const rootStrippedFromDss = info.root.replace(/\/DSS \d+\s+\d+\s+R\n/g, '');

  return Buffer.concat([
    Buffer.from(`${rootIndex} 0 obj\n`),
    Buffer.from('<<'),
    Buffer.from(`${rootStrippedFromDss}`),
    Buffer.from(`/DSS ${dss}\n`),
    Buffer.from('>>\nendobj\n'),
  ]);
};

Calling of the plainAddPlaceholder looks like this:

const tmpBinary = await plainAddPlaceholder({
  pdfBuffer: binary,
  reason: 'Your reason',
  signatureLength,
  contactInfo: 'Contact Info',
  name: 'Desired name',
  subFilter: SUBFILTER_ETSI_CADES_DETACHED,
  signerIdx,
  signerPerson,
  signatureData,
  signingTime,
  certificatePem,
  intermediateCertPem,
  ocspResponseSignerCert,
});

After calling the p7.sign({ detached: true }); and before adding the signature to the document, we still need to add the TSA response as an unauthenticatedAttribute to the p7 object:

const forgeSignature = p7.signers[0].signature;

const timeStampToken = await tsa({
  tsaUrl: <url_of_the_tsa>,
  signature: forgeSignature,
});

if (timeStampToken) {
  // Populate timeStampToken in unauthenticated attribute
  // TODO: Might be a better way to do this, something else than indexes
  p7.signerInfos[0].value[6].value[0].value[1] = forge.asn1.create(
    forge.asn1.Class.UNIVERSAL,
    forge.asn1.Type.SET,
    true,
    [timeStampToken]
  );
}

TSA request can be made as follows:

const tsa = async ({ tsaUrl, signature }) => {
  // Generate SHA256 hash from signature content for TSA
  const md = forge.md.sha256.create();
  md.update(signature);
  const digest = md.digest().getBytes();

  const request = new jsrsasign.asn1.tsp.TimeStampReq({
    messageImprint: { alg: 'sha256', hash: Buffer.from(digest, 'binary').toString('hex') },
    certreq: true,
  });
  const requestHex = request.getEncodedHex();
  const tsr = forge.util.hexToBytes(requestHex);

  try {
    const response = await axios({
      method: 'post',
      url: tsaUrl,
      data: Buffer.from(tsr, 'binary'),
      headers: {
        'Content-Type': 'application/timestamp-query',
      },
      responseType: 'arraybuffer',
      responseEncoding: 'binary',
    });
    const responseAsn1 = forge.asn1.fromDer(response.data.toString('binary'));

    // Return the token (it contains cert data)
    return responseAsn1.value[1];
  } catch (error) {
    return undefined;
  }
};

I'm sure there is still something that I forgot, but hopefully this will help to get some kind of clue on what needs to be done in order to get the library to produce signatures that Adobe recognises as LTV enabled. And huge thanks for @dhensby for doing the heavy lifting on the initial PR and on the topic in general on producing highly compliant signatures with these libraries.

For reference, I found the ETSI org signature conformance checker to be really useful for debugging this. It requires registration, but is free to use: https://signatures-conformance-checker.etsi.org/pub/index.php

You can also find some free TSAs with a bit of googling, but here's a list that I found useful: https://gist.github.com/Manouchehri/fd754e402d98430243455713efada710

cw6365 commented 7 months ago

Fantastic, many thanks @sagge