foliojs / pdfkit

A JavaScript PDF generation library for Node and the browser
http://pdfkit.org/
MIT License

Embedding image ICC profiles into PDF #1434

Open xh4010 opened 1 year ago

xh4010 commented 1 year ago

This is a great library that I use to generate high-quality PDF files for printing. My JPG images usually contain ICC color profiles, but pdfkit cannot embed ICC profiles into the PDF and offers no hook for doing so, so I have been looking for an alternative.

I wrote a function called getICC that extracts ICC data from both JPEG and PNG images, and it works in my tests. I then tried to integrate getICC into pdfkit by modifying the JPEG class in lib/image/jpeg.js so that the extracted ICC data is embedded into the PDF file, but the attempt was unsuccessful.

class JPEG {
  constructor(data, label) {
    ...
    this.channels = this.data[pos++];
    this.colorSpace = COLOR_SPACE_MAP[this.channels]

    this.obj = null;
  }

  embed(document) {
    if (this.obj) {
      return;
    }
    let colorSpace=this.colorSpace;
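    getICC(this.data).then(icc=>{    // extract the ICC profile if one exists (asynchronous)
      this.icc=icc;
    })
    // NOTE: the promise above has not resolved yet, so this.icc is still unset here
    if(!!this.icc){    // embed ICC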
      this.icc.obj=document.ref({
        Alternate: colorSpace,
        Filter: 'FlateDecode',
        N: this.channels,
        Length: this.icc.buffer.length
      })
      this.icc.obj.write(this.icc.buffer);
      this.icc.obj.end();
      this.icc.based=document.ref({
        ICCBased:this.icc.obj
      })
      this.icc.based.end();
      colorSpace=this.icc.based;
      this.colorSpace='ICCBased'
    }
    this.obj = document.ref({
      Type: 'XObject',
      Subtype: 'Image',
      BitsPerComponent: this.bits,
      Width: this.width,
      Height: this.height,
      ColorSpace: colorSpace,
      Filter: 'DCTDecode'
    });
    ...
    // free memory
    return (this.data = null);
  }
}
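A side note on the `N: this.channels` entry above: for an ICCBased stream, the PDF specification requires N to match the number of components of the profile itself (1, 3, or 4), and that can differ from the JPEG channel count if the wrong profile is attached. A minimal sketch of a sanity check follows, assuming the profile is available as a Node Buffer. The name `iccComponentCount` is hypothetical and is not part of pdfkit or of getICC.

```javascript
// Hypothetical helper: derive the PDF /N value from an ICC profile header.
const COMPONENTS_BY_SIGNATURE = new Map([
  ['GRAY', 1],
  ['RGB ', 3], // the signature is padded with a trailing space
  ['CMYK', 4]
]);

function iccComponentCount(profile) {
  // The ICC header is 128 bytes; bytes 36..39 hold the 'acsp' file signature.
  if (profile.length < 128 || profile.toString('latin1', 36, 40) !== 'acsp') {
    return null;
  }
  // Bytes 16..19 hold the colour space signature of the profile's data.
  const signature = profile.toString('latin1', 16, 20);
  return COMPONENTS_BY_SIGNATURE.get(signature) ?? null;
}
```

If the returned value is null or differs from this.channels, falling back to the plain DeviceRGB/DeviceCMYK/DeviceGray color space is probably safer than embedding the profile.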

The problem is that getICC is an asynchronous function, while almost all of the functions in pdfkit are synchronous, so I cannot use the await keyword inside embed to wait for getICC's return value. The then callback only runs after embed has already returned, so this.icc is still unassigned when the if check executes.
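One possible way around the constraint described above, for the JPEG case only, is to avoid the promise entirely: the image bytes are already in memory as this.data, and a JPEG stores its ICC profile in APP2 segments that begin with the identifier "ICC_PROFILE\0", followed by a chunk sequence number and a chunk count. The sketch below parses those segments synchronously. It assumes Node's Buffer, and `extractJpegIccSync` is a hypothetical name, not the author's getICC and not a pdfkit API.

```javascript
// Hypothetical helper: synchronously collect the ICC profile from a JPEG buffer.
const ICC_SIGNATURE = Buffer.from('ICC_PROFILE\0', 'latin1'); // 12 bytes

function extractJpegIccSync(data) {
  // A JPEG file must start with the SOI marker 0xFFD8.
  if (data.length < 4 || data.readUInt16BE(0) !== 0xffd8) return null;

  const chunks = [];
  let pos = 2;
  while (pos + 4 <= data.length) {
    if (data[pos] !== 0xff) break;            // lost marker sync, give up
    const marker = data[pos + 1];
    if (marker === 0xff) { pos++; continue; } // skip fill byte
    if (marker === 0xda || marker === 0xd9) break; // SOS or EOI: headers are over

    // The segment length counts its own two bytes but not the marker.
    const length = data.readUInt16BE(pos + 2);
    if (length < 2 || pos + 2 + length > data.length) break;
    const payload = data.subarray(pos + 4, pos + 2 + length);

    // APP2 segment carrying an ICC chunk: signature, sequence number, count, data.
    if (marker === 0xe2 && payload.length >= 14 &&
        payload.subarray(0, 12).equals(ICC_SIGNATURE)) {
      chunks.push({ seq: payload[12], body: payload.subarray(14) });
    }
    pos += 2 + length;
  }

  if (chunks.length === 0) return null;
  // Large profiles are split across segments; reassemble them in order.
  chunks.sort((a, b) => a.seq - b.seq);
  return Buffer.concat(chunks.map(c => c.body));
}
```

With a helper like this, embed could call `this.icc = extractJpegIccSync(this.data)` before the if check and stay synchronous. PNG would need its own handling, since the iCCP chunk is zlib-compressed.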

xh4010 commented 1 year ago

I added two asynchronous methods, imageWithICC and embedWithICC. With them, the ICC profile is embedded into the PDF file whenever the JPG image contains one.

const PDFDocument=require('..');
const fs=require('fs');
(async()=>{
  let doc = new PDFDocument();
  doc.pipe(fs.createWriteStream('embedicc.pdf'));
  doc
    .text('noICC', 40, 50)
    .image('images/landscape.jpg', 40, 70, {
      width: 200,
      height: 267
    })
    .text('embedICC', 280, 50);
  await doc.imageWithICC('images/landscape+icc.jpg', 280, 70, {
    width: 200,
    height: 267
  });
  doc.end();
})()

PDF result: