MatthiasValvekens / pyHanko

pyHanko: sign and stamp PDF files
MIT License

Adding a digital signature is broken for PDF files larger than 100 000 000 bytes #336

Closed msongd closed 10 months ago

msongd commented 10 months ago

Describe the bug

When adding a digital signature to a PDF larger than 100 000 000 bytes, pyHanko produces a corrupted PDF. The command reports success and the file opens normally in Adobe Acrobat Reader, but the digital signature is not shown.

This is the command I used:

% pyhanko --verbose sign addsig --field Sig1 pemder --no-pass --key selfsigned-key.pem --cert selfsigned.pem ~/Downloads/y2.pdf bad.pdf
2023-11-02 10:40:58,491 - root - DEBUG - Running with --verbose
2023-11-02 10:40:58,492 - root - DEBUG - There was no configuration to parse.
2023-11-02 10:40:58,558 - asyncio - DEBUG - Using selector: KqueueSelector
2023-11-02 10:40:58,561 - tzlocal - DEBUG - /etc/localtime found
2023-11-02 10:40:58,562 - tzlocal - DEBUG - 1 found:
 {'/etc/localtime is a symlink to': 'Asia/Ho_Chi_Minh'}

Another PDF file smaller than 100 000 000 bytes, signed with the same command and certificate, shows the digital signature normally in Adobe Acrobat Reader.

To Reproduce

% pyhanko --verbose sign addsig --field Sig1 pemder --no-pass --key selfsigned-key.pem --cert selfsigned.pem ~/Downloads/y2.pdf bad.pdf

Expected behavior

The output PDF should show the digital signature in Adobe Acrobat Reader.

Additional context

Validating the bad output PDF with pdfcpu returns this error:

% pdfcpu validate bad.pdf
validating(mode=relaxed) bad.pdf ...
dereferenceObject: problem dereferencing object 11036: pdfcpu: parse: corrupt name object

I can provide a sample PDF, but it is quite large (100264574 bytes); please advise me on how to send the file. A sample 100 MB PDF file from https://testfile.org/all-pdf-sample-test-file-download-direct/ can also be used to trigger this behavior.

MatthiasValvekens commented 10 months ago

Interesting. I haven't tried to repro this yet, but I bet this is the problem: https://github.com/MatthiasValvekens/pyHanko/blob/e13f7c1c803b0ea753a8b34a44f2c12018d79d06/pyhanko/sign/signers/pdf_byterange.py#L73-L78.

The fix should be straightforward, though. Good catch!
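
For readers following along: the thread does not spell out the root cause, but the 100 000 000-byte threshold (the first 9-digit offset) would be consistent with a ByteRange placeholder whose fields are zero-padded to 8 digits and later overwritten in place. The sketch below illustrates that failure mode under that assumption. The function names, the toy dictionary layout, and the example offsets are hypothetical and are not taken from pyHanko's code.

```python
import io


def format_byte_range(values, width):
    # Zero-pad each of the four integers to `width` digits.
    # Zero-padding sets a minimum width, not a maximum: a value with
    # more digits than `width` simply makes the output longer.
    fields = " ".join(f"{v:0{width}d}" for v in values)
    return f"[ {fields} ]".encode("ascii")


def sig_dict_after_fill(values, width):
    # Toy stand-in for a signature dictionary (hypothetical layout).
    buf = io.BytesIO()
    buf.write(b"/ByteRange ")
    offset = buf.tell()
    buf.write(format_byte_range((0, 0, 0, 0), width))  # reserve space
    buf.write(b" /Type /Sig")
    # Once the real offsets are known, overwrite the placeholder in place.
    buf.seek(offset)
    buf.write(format_byte_range(values, width))
    return buf.getvalue()


# Offsets below 10**8 fit in the reserved 8-digit fields.
small = sig_dict_after_fill((0, 5_000, 30_000, 99_000_000), width=8)

# Two 9-digit offsets make the filled array 2 bytes longer than the
# placeholder, so it overwrites the start of the following "/Type" name.
large = sig_dict_after_fill((0, 100_264_000, 100_280_386, 900), width=8)

# Reserving wider fields avoids the overflow for the same offsets.
wide = sig_dict_after_fill((0, 100_264_000, 100_280_386, 900), width=10)

print(small)
print(large)
print(wide)
```

In this toy layout, the overflowing write destroys the leading slash of the name that follows the array, which would at least be compatible with the "corrupt name object" error reported by pdfcpu. Whether the actual fix widened the fields or took another approach is not stated in this thread.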

MatthiasValvekens commented 10 months ago

Can you try again with the version on the master branch?

msongd commented 10 months ago

Thank you very much. The master branch works now.