digitorus / pdfsign

Add/verify Advanced Electronic Signature (AES) and Qualified Electronic Signature (QES) in PDF (using pure Go)
BSD 2-Clause "Simplified" License
74 stars 16 forks

Signing a PDF with multiple revisions results in malformed xref table #4

Closed mpldr closed 4 months ago

mpldr commented 1 year ago

With the following code, the signed PDF is invalid.

main.go

package main

import (
    "crypto"
    "crypto/rsa"
    "crypto/x509"
    "encoding/pem"
    "fmt"
    "log"
    "os"
    "time"

    "github.com/digitorus/pdfsign/sign"
    cli "github.com/jawher/mow.cli"
)

func main() {
    app := cli.App("pdfsigner", "sign pdfs with ease")
    app.Spec = "[--cert][--key] [--name][--location][--reason][--contact] INPUT OUTPUT"

    cert := app.StringOpt("cert", "./cert.pem", "PEM-encoded certificate file")
    key := app.StringOpt("key", "./privkey.pem", "PEM-encoded private key")
    name := app.StringOpt("name", "Accounting", "name of the signer")
    location := app.StringOpt("location", "Company\nStreet Address\nPostcode Gröditz", "location of the signer")
    reason := app.StringOpt("reason", "authenticating document validity", "reason for signing")
    contact := app.StringOpt("contact", "Head Accountant", "contact information of the signer")

    // Note: the option pointers are dereferenced here, before app.Run parses
    // the arguments, so sdata carries the default values.
    sdata := sign.SignDataSignatureInfo{
        Name:        *name,
        Location:    *location,
        Reason:      *reason,
        ContactInfo: *contact,
        Date:        time.Now().Local(),
    }

    input := app.StringArg("INPUT", "", "the file to sign")
    output := app.StringArg("OUTPUT", "", "the path for the signed pdf")

    app.Action = runSign(cert, key, input, output, sdata)
    app.Run(os.Args)
}

// runSign returns the CLI action that loads the certificate and key and signs the input file.
func runSign(
    certPath *string,
    keyPath *string,
    in *string,
    out *string,
    signatureInformation sign.SignDataSignatureInfo,
) func() {
    return func() {
        cert, err := readCert(*certPath)
        if err != nil {
            log.Fatalf("failed to parse certificate: %v", err)
        }

        key, err := readKey(*keyPath)
        if err != nil {
            log.Fatalf("failed to parse private key: %v", err)
        }

        err = sign.SignFile(*in, *out, sign.SignData{
            Signature: sign.SignDataSignature{
                Info:       signatureInformation,
                CertType:   sign.CertificationSignature,
                DocMDPPerm: sign.AllowFillingExistingFormFieldsAndSignaturesAndCRUDAnnotationsPerms,
            },
            Signer:             key,
            DigestAlgorithm:    crypto.SHA256,
            Certificate:        cert,
            RevocationFunction: sign.DefaultEmbedRevocationStatusFunction,
        })
        if err != nil {
            log.Fatalf("failed to sign file: %v", err)
        }
    }
}

// readCert loads a PEM-encoded X.509 certificate from certPath.
func readCert(certPath string) (*x509.Certificate, error) {
    certContent, err := os.ReadFile(certPath)
    if err != nil {
        return nil, fmt.Errorf("failed to read file '%s': %w", certPath, err)
    }

    certData, _ := pem.Decode(certContent)
    if certData == nil {
        return nil, fmt.Errorf("failed to parse PEM encoded data")
    }

    cert, err := x509.ParseCertificate(certData.Bytes)
    if err != nil {
        return nil, fmt.Errorf("failed to parse certificate data: %w", err)
    }
    return cert, nil
}

// readKey loads a PEM-encoded PKCS #8 private key from keyPath and requires it to be an RSA key.
func readKey(keyPath string) (*rsa.PrivateKey, error) {
    keyContent, err := os.ReadFile(keyPath)
    if err != nil {
        return nil, fmt.Errorf("failed to read file '%s': %w", keyPath, err)
    }

    keyData, _ := pem.Decode(keyContent)
    if keyData == nil {
        return nil, fmt.Errorf("failed to parse PEM encoded data")
    }

    key, err := x509.ParsePKCS8PrivateKey(keyData.Bytes)
    if err != nil {
        return nil, fmt.Errorf("failed to parse private key data: %w", err)
    }

    switch k := key.(type) {
    case *rsa.PrivateKey:
        return k, nil
    default:
        return nil, fmt.Errorf("key is not a RSA key. Sorry, but we need the ancient ones. (it's type %T)", key)
    }
}

When you run pdfsign verify Acrobat_DigitalSignatures_in_PDF_signed.pdf the error "Failed to open file: malformed PDF: malformed xref table" is returned.

Original: Acrobat_DigitalSignatures_in_PDF.pdf Signed: Acrobat_DigitalSignatures_in_PDF_signed.pdf

vanbroup commented 1 year ago

I can confirm this happens with some files but have not been able to identify the cause yet.

vanbroup commented 1 year ago

Signing fails if the document has more than one xref table (revisions).
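A quick way to check whether a given file falls into this category is to count its startxref keywords. The sketch below is only a heuristic and not part of pdfsign: the function name countRevisions is made up for illustration, and the count can be off for linearized files or when the keyword happens to appear inside stream data.

```go
package main

import (
	"bytes"
	"fmt"
)

// countRevisions (hypothetical helper) returns the number of "startxref"
// keywords in a PDF. Each saved revision normally ends with its own
// startxref/%%EOF pair, so a count above 1 suggests that the file carries
// more than one cross-reference section.
func countRevisions(pdf []byte) int {
	return bytes.Count(pdf, []byte("startxref"))
}

func main() {
	// A toy two-revision layout; this is not a complete, valid PDF and the
	// offsets are illustrative only.
	doc := []byte("%PDF-1.7\n" +
		"xref\n0 1\n0000000000 65535 f \n" +
		"trailer\n<< /Size 1 >>\nstartxref\n9\n%%EOF\n" +
		"xref\n1 1\n0000000100 00000 n \n" +
		"trailer\n<< /Size 2 /Prev 9 >>\nstartxref\n80\n%%EOF\n")
	fmt.Println("revisions:", countRevisions(doc))
}
```

Running this against both the original and the signed file gives a first hint of whether the input already had incremental updates before signing.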

vanbroup commented 1 year ago

This item count includes all objects of the document: https://github.com/digitorus/pdfsign/blob/4fb6fafba6e3fb25d7bd00b71ba5a477226d1188/sign/pdfxref.go#L32

But only the entries of the (latest?) xref table are included in the new xref table.

Using the correct count for the xref does not seem to fix the problem.

Appending a new xref table that references the previous one, instead of copying all entries from the last table, seems to be a cleaner approach, but it has not worked so far.
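For reference, this is roughly what such a referencing section looks like in the PDF incremental-update model: it lists only the new objects and points back at the previous section through /Prev in the trailer. The sketch is not pdfsign code. The function incrementalXref and its parameters are hypothetical, it assumes the new objects have consecutive numbers above every existing object, and a real trailer would also need to carry over other keys from the previous trailer (such as /Info and /ID).

```go
package main

import (
	"fmt"
	"strings"
)

// incrementalXref (hypothetical helper) renders a classic cross-reference
// section for newly appended objects.
//
//	firstID  - object number of the first new object
//	offsets  - byte offsets of the new objects, in object-number order
//	rootID   - object number of the document catalog
//	prevXref - byte offset of the previous xref section (goes into /Prev)
//	newXref  - byte offset at which this section itself is written
func incrementalXref(firstID int, offsets []int64, rootID int, prevXref, newXref int64) string {
	var b strings.Builder

	// Subsection header: first object number and number of entries.
	fmt.Fprintf(&b, "xref\n%d %d\n", firstID, len(offsets))

	// Every entry is exactly 20 bytes: 10-digit offset, 5-digit generation,
	// the in-use marker "n", and a two-byte end of line (space + LF).
	for _, off := range offsets {
		fmt.Fprintf(&b, "%010d %05d n \n", off, 0)
	}

	// /Size is the highest object number plus one (assumption: the new
	// objects are the highest-numbered ones in the file).
	size := firstID + len(offsets)
	fmt.Fprintf(&b, "trailer\n<< /Size %d /Root %d 0 R /Prev %d >>\n", size, rootID, prevXref)
	fmt.Fprintf(&b, "startxref\n%d\n%%%%EOF\n", newXref)
	return b.String()
}

func main() {
	// Two new objects (10 and 11) appended to a file whose previous xref
	// section starts at byte 900. All numbers are made up.
	fmt.Print(incrementalXref(10, []int64{1234, 5678}, 1, 900, 6000))
}
```

If the update also rewrites existing objects, such as the catalog, those would need their own subsection header inside the same xref section, which this sketch does not handle.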

The ReaderXrefInformation only contains information about the entire document and probably needs to be updated to contain information about each xref table or stream: https://github.com/digitorus/pdf/blob/master/read.go#L92

dhernandez commented 8 months ago

This code in the writeXrefTable function does the trick, but it is probably not the best option:

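func (context *SignContext) writeXrefTable() error {
    // Jump to the start of the last xref table in the input file.
    if _, err := context.InputFile.Seek(context.PDFReader.XrefInformation.StartPos, io.SeekStart); err != nil {
        return fmt.Errorf("failed to seek to xref table: %w", err)
    }
    scanner := bufio.NewScanner(context.InputFile)
    scanner.Split(bufio.ScanWords)
    scanner.Scan() // This should be the "xref" keyword
    scanner.Scan()
    first_object_id := scanner.Text() // first object number of the subsection
    scanner.Scan()
    xref_items := scanner.Text() // number of entries in the subsection
    xref_items_i, err := strconv.Atoi(xref_items)
    if err != nil {
        return fmt.Errorf("failed to parse xref entry count %q: %w", xref_items, err)
    }

    // The original subsection header, and the same header with four more entries.
    xref_size := fmt.Sprintf("xref\n%s %s\n", first_object_id, xref_items)
    new_xref_size := fmt.Sprintf("xref\n%s %s\n", first_object_id, fmt.Sprint(xref_items_i+4))
...
}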
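The header-parsing step of this workaround can be tried in isolation. The following is a self-contained sketch and not the pdfsign implementation: xrefHeader is a hypothetical helper, it only understands a classic xref table (not a cross-reference stream), and it reads only the first subsection header.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"strconv"
)

// xrefHeader (hypothetical helper) parses the "xref" keyword and the first
// subsection header ("<first object number> <entry count>"). The data must
// begin at the xref keyword, e.g. the file contents sliced at the offset
// given by startxref.
func xrefHeader(data []byte) (first, count int, err error) {
	sc := bufio.NewScanner(bytes.NewReader(data))
	sc.Split(bufio.ScanWords)

	// Collect the first three whitespace-separated tokens.
	words := make([]string, 0, 3)
	for len(words) < 3 && sc.Scan() {
		words = append(words, sc.Text())
	}
	if scanErr := sc.Err(); scanErr != nil {
		return 0, 0, scanErr
	}
	if len(words) < 3 || words[0] != "xref" {
		return 0, 0, fmt.Errorf("no classic xref table at this position")
	}

	if first, err = strconv.Atoi(words[1]); err != nil {
		return 0, 0, fmt.Errorf("invalid first object number: %w", err)
	}
	if count, err = strconv.Atoi(words[2]); err != nil {
		return 0, 0, fmt.Errorf("invalid entry count: %w", err)
	}
	return first, count, nil
}

func main() {
	first, count, err := xrefHeader([]byte("xref\n0 7\n0000000000 65535 f \n"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Rebuild the header with four additional entries, as the workaround does.
	fmt.Printf("xref\n%d %d\n", first, count+4)
}
```

Returning an error when the keyword is missing makes the failure visible for files whose last cross-reference section is a stream, a case where the scanner-based approach would otherwise read unrelated tokens.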