foliojs / pdfkit

A JavaScript PDF generation library for Node and the browser
http://pdfkit.org/
MIT License

Finalized doc does not conform to PDF 1.7 standards #1548

Open fuzailgilani opened 2 weeks ago

fuzailgilani commented 2 weeks ago

Bug Report

Description of the problem

The PDFs we generate with PDFKit open in most PDF readers (e.g. Preview, browsers), but Adobe Acrobat reports the file as corrupted and gives error code 135, which indicates that the file does not conform to the PDF 1.7 standard. I ran one of the files through an online validation tool, which gave the following results:

Compliance: pdf1.7
Result: Document does not conform to PDF/A.
Details:
Validating file "WithPDFKitv15NoCompression.pdf" for conformance level pdf1.7
    The "endobj" keyword is missing.
    The key OutputConditionIdentifier is required but missing.
    The value of the key Info must not be of type name.
    The key Info is required but missing.
    The key DestOutputProfile is required but missing.
    The embedded ICC profile couldn't be read.
    The document does not conform to the requested standard.
    The file format (header, trailer, objects, xref, streams) is corrupted.
    The document doesn't conform to the PDF reference (missing required entries, wrong value types, etc.).
    The document does not conform to the PDF 1.7 standard.
Done.

We first noticed the issue with PDFKit version 0.13.0, and thought maybe upgrading to the latest version 0.15.0 would fix it, but no luck.

Code sample

We have a couple thousand lines of code for PDF generation as it's pretty central to our application and there's a lot of branching logic, but for now I'll just include how we initialize the document:

    const pdfDoc = new PDFDocument({
      size: 'A4',
      pdfVersion: '1.7',
      bufferPages: true,
      margins: {
        top: MARGIN,
        bottom: MARGIN,
        left: MARGIN,
        right: MARGIN,
      },
      compress: false,
    });

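    const bufferPromise = new Promise<Buffer>((resolve, reject) => {
      // Collect the chunks emitted by the document stream.
      const buffers: Buffer[] = [];

      pdfDoc.on('data', (chunk: Buffer) => buffers.push(chunk));
      pdfDoc.on('end', () => resolve(Buffer.concat(buffers)));
      // Surface stream errors instead of leaving the promise pending.
      pdfDoc.on('error', reject);
    });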

Your environment

fuzailgilani commented 2 weeks ago

Okay, we've figured out what the issue was. A few weeks back, we went through the entire project and fixed all of its linter errors and warnings. Further down from the snippet I posted above, we set up the ICC profile for the PDF like this:

    // PDF/A standard requires embedded color profile.
    const colorProfile = Buffer.from(SRGB_IEC61966_ICC_PROFILE, 'base64');
    const refColorProfile = doc.ref({
      Length: colorProfile.length,
      N: 3,
    });
    refColorProfile.write(colorProfile);
    refColorProfile.end('');

    const rgbString = 'sRGB IEC61966-2.1';
    const refOutputIntent = doc.ref({
      Type: 'OutputIntent',
      S: 'GTS_PDFA1',
      Info: rgbString,
      OutputConditionIdentifier: rgbString,
      DestOutputProfile: refColorProfile,
    });
    refOutputIntent.end('');

The problem was the const rgbString line. Before our linter fixes, it read:

    const rgbString = new String('sRGB IEC61966-2.1');

Our linter flagged it under the no-new-wrappers rule, but the String object turns out to be necessary for PDFKit to produce a valid file. That is very odd and should probably be handled better by the library. For now, we've added an ESLint disable comment for that rule on that line and reverted to the new String constructor.
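
For anyone hitting the same thing: the validator message "The value of the key Info must not be of type name" suggests that the primitive string was written out as a PDF name rather than a PDF string. A PDF name cannot contain an unescaped space, so the entry breaks at "sRGB". The sketch below illustrates that distinction with a hypothetical helper, toPdfValue. It is not PDFKit's code, and the serialization rule it encodes (primitive string becomes a name, String object becomes a literal string) is an assumption inferred from the behavior described in this thread.

```typescript
// Hypothetical, simplified serializer for illustration only; NOT PDFKit's code.
// Assumption: primitive strings are emitted as PDF names, String wrapper
// objects are emitted as PDF literal strings.
function toPdfValue(value: string | String): string {
  if (typeof value === 'string') {
    // Primitive string: written as a name, e.g. /GTS_PDFA1
    return `/${value}`;
  }
  // String object: written as a literal string in parentheses.
  // Backslashes and parentheses are escaped inside a PDF literal string.
  const escaped = value.valueOf().replace(/[\\()]/g, (c) => `\\${c}`);
  return `(${escaped})`;
}

// A name containing a space is malformed: a parser reads "/sRGB" and stops.
const asName = toPdfValue('sRGB IEC61966-2.1');

// eslint-disable-next-line no-new-wrappers
const asString = toPdfValue(new String('sRGB IEC61966-2.1'));

console.log(asName);   // /sRGB IEC61966-2.1
console.log(asString); // (sRGB IEC61966-2.1)
```

Under that assumption, values such as 'OutputIntent' and 'GTS_PDFA1' are fine as primitives because they are meant to be names, while Info and OutputConditionIdentifier need the String object because they are meant to be text strings.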