MatthiasValvekens / pyHanko

pyHanko: sign and stamp PDF files
MIT License

PDF signing breaks if no fields object in Acroform #403

Closed yash-lz closed 6 months ago

yash-lz commented 6 months ago

Describe the bug

Some PDFs (possibly generated by iText) have an AcroForm dictionary that lacks the 'Fields' array (key).

Standard PDF format:

{'DA': b'/Helv0 0 Tf 0 g', 'DR': {'Font': {'Helv': <PDFObjRef:5>, 'Helv0': <PDFObjRef:5>}}, 'Fields': []}

Failed PDF format (Fields key is missing):

{'DA': b'/Helv0 0 Tf 0 g', 'DR': {'Font': {'Helv': <PDFObjRef:5>, 'Helv0': <PDFObjRef:5>}}}

Because of this, when we try to append a signature field, which in turn calls prepare_sig_field, we hit the following code and error out with '/AcroForm has no /Fields'.

form = root['/AcroForm']

try:
    fields = form['/Fields']
except KeyError:
    raise PdfError('/AcroForm has no /Fields')

To Reproduce

Select a PDF whose AcroForm has no Fields array. This may happen when flattening some PDFs created with iText. I cannot share the exact PDF due to privacy concerns.

Expected behavior

We should be able to add the signature field even if there is no Fields array. When the array isn't present, create it, as the code below does (the same approach already used when creating the form).

try:
    fields = form['/Fields']
except KeyError:
    fields = generic.ArrayObject()
    form[pdf_name('/Fields')] = fields
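
To illustrate the proposed behavior outside of pyHanko, here is a minimal sketch that uses a plain Python dict and list as stand-ins for the AcroForm dictionary and the Fields array. The helper name ensure_fields is hypothetical and is not part of pyHanko; the real code operates on pyHanko's own PDF object types.

```python
def ensure_fields(form: dict) -> list:
    """Return the form's /Fields array, creating an empty one if it is missing.

    `form` is a plain dict standing in for the /AcroForm dictionary
    (an assumption made for illustration only).
    """
    try:
        fields = form['/Fields']
    except KeyError:
        # Instead of raising, attach an empty array so a field can be appended.
        fields = []
        form['/Fields'] = fields
    return fields


# An AcroForm-like dict with no /Fields key, mirroring the failing PDF.
broken_form = {'/DA': b'/Helv0 0 Tf 0 g', '/DR': {}}

fields = ensure_fields(broken_form)
fields.append('Signature1')  # placeholder for a signature field reference
print(broken_form['/Fields'])
```

Because the new array is stored in the form before it is returned, anything appended to it afterwards is visible through the form itself.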

Screenshots N/A

Environment (please complete the following information): Doesn't matter

Additional context

MatthiasValvekens commented 6 months ago

Hi, thanks for the report.

I don't fully recall why I added that check; it probably had something to do with the modification tracking while doing incremental updates in the early days of the project (it was a lot more "manual" back then).

Anyway, I've removed it in #404. Can you give that branch a go to see if it works for you?

yash-lz commented 6 months ago

Hi, thanks for the immediate fix.

I tried out the fix on the bugfix branch and it looks good. Cheers