vbuch / node-signpdf

Simple signing of PDFs in node.
MIT License
696 stars 175 forks source link

Encrypted PDFs - blank pages #98

Closed waaronking closed 3 years ago

waaronking commented 3 years ago

Hi @vbuch! First of all, massive thanks for all the work you have done putting node-signpdf together. It works great!

I'm working on getting node-signpdf to properly attach signatures to password-protected (encrypted) PDFs. I think that might be beyond the original scope of this package, but if I can get a working solution, I'd love to contribute back to it.

Here are 3 basic tests (for historical purposes) and their results:

Test 1 - If I sign a PDF using node-signpdf and then password-protect it (using something like qpdf), the signature still shows up, but is invalid probably since the PDF has drastically changed since the signature was added. (Adobe Reader gives the following signature support information: SigDict /Contents illegal data)

Test 2 - If I (1) password-protect the PDF and then attempt to sign it, the PDF ends up being all blank pages after (2) adding the placeholder. I think adding the signature after the encryption may be corrupting the PDF format (resulting in the blank pages).

  1. Test2 - password.pdf
  2. Test2 - placeholder.pdf (password: pass123)

Test 3 - If I (1) add a placeholder to the PDF, (2) password-protect it, and then attempt to sign it, the ByteRange placeholder can't be found and adding the signature fails.

UnhandledPromiseRejectionWarning: Error: Could not find ByteRange placeholder: /ByteRange
  1. Test3 - placeholder.pdf
  2. Test3 - password.pdf (password: pass123)

Moving forward

I've attached PDFs in their corresponding creation order to help properly work through this. I know these tests are basic, but I'm still trying to understand the proper order for signing password-protected (encrypted) PDFs. Based on my understanding, it seems like the order in Test 3 might be the best way to move forward? If you open "Test3 - password.pdf" above, you can see that the password-protected PDF has a placeholder ready to sign.

I'm sure you're really busy @vbuch, but I was hoping to see if based on your understanding this feels like the right way forward, or if it seems like I'm way off base? Any thoughts you may have would be greatly appreciated! Thanks

waaronking commented 3 years ago

I figured it out. Test 3 from above is the proper workflow for signing password-protected PDFs. Some utilities such as QPDF do forms of linting on the PDF data resulting in the error I mentioned above. A simple change in the search string correctly found the placeholder and allowed node-signpdf to properly attach the signature.

Instead of searching for: /ByteRange [${byteRangePlaceholder.join(' ')}]

I had to search for: /ByteRange [ ${byteRangePlaceholder.join(' ')} ]

This probably is not PR worthy as different tools may add different forms of linting, but let me know @vbuch if you want me to add a PR for this or any other information to your examples regarding signing password-protected PDFs. Thanks again!

vbuch commented 3 years ago

This is a bug actually. ByteRange is an Array with 4 elements. Whitespace characters are allowed there (the beginning and/or end of an array). Also, the spec says "...PDF treats any sequence of consecutive white-space characters as if there were just one." in section 3.1.1 here. This would mean that this search is quite far from the spec.

  1. Any white-space character is allowed after a token (token being the "[" in your case)
  2. Any amount of white-space characters should be treated as 1 between the tokens.

Meaning that this should match:

ByteRange [      0    1 2
           33333    ]

And it won't, unless our check is converted to a regex. A PR would be welcome, @waaronking. I would move the matching into a separate function so that it is easily and separately testable.

waaronking commented 3 years ago

Good find on the PDF spec. And that sounds good @vbuch. I'm working on a PR with the matching regex right now. Per the structure of your code, I'm planning on adding the matching function as a separate helper function. Such as _helpers.findByteRange? Does that feel like the right place?

waaronking commented 3 years ago

Let me know what you think of https://github.com/vbuch/node-signpdf/pull/100

vbuch commented 3 years ago

Great contribution! Thank you! It has been merged in develop. I'll make sure I release it today or tomorrow. Thanks again.

waaronking commented 3 years ago

Great, thanks for the help!

eldareini commented 2 years ago

Hi, I have the same problem. I'm trying to sign an encrypted pdf that can't be modified with p12 certificate. if I'm signing the pdf without encryption and permissions, I am getting the pdf as I was supposed to. when I add the encryption and permissions, I am getting a blank page.

my code:

      pdfBuffer=pdfKitBuffer;
      pdfBuffer=plainAddPlaceholder({
        pdfBuffer,
        reason: 'reason',
        location: 'location',
        signatureLength: 2612
      });
      pdfBuffer=signer.sign(pdfBuffer,p12Buffer,{passphrase: pdfCertificatePassword}); 
waaronking commented 2 years ago

@eldareini this is per my experience, but the order of operations for password-protecting (encrypting) and signing the PDF is essential. If you do it in the wrong order, you'll end up corrupting the PDF resulting in blank pages (as outlined in Test 2).

Test 3 above does work following this order:

  1. Generate PDFBuffer
  2. Add the placeholder (via node-signpdf)
  3. Password protect (encrypt) the PDF (via tool like QPDF)
  4. Sign the PDF (via node-signpdf)

If you encrypt the PDF before adding a signature placeholder, you'll end up with blank pages. If you encrypt the PDF after signing it, it will invalidate your signature. Which tool are you using to password protect (encrypt) your PDF?