galkahana / HummusJS

Node.js module for high performance creation, modification and parsing of PDF files and streams
http://www.pdfhummus.com

Question: reading transformations from PDF #407

Open bebu259 opened 4 years ago

bebu259 commented 4 years ago

Hi,

I am using the code below to add an image to a PDF:

const pdfModifyWriter = hummus.createWriterToModify(inPdfPath, { modifiedFilePath: outPdfPath });
const pageModifier = new hummus.PDFPageModifier(pdfModifyWriter, 0);

const ctx = pageModifier.startContext().getContext();
ctx.drawImage(x, y, imagePath, { transformation: { width: w, height: w, proportional: true } });

pageModifier.endContext().writePage();
pdfModifyWriter.end();

This works well, but for some PDFs the image is mirrored on the x-axis. This can be fixed by adding a transform:

ctx.q();
ctx.cm(1, 0, 0, -1, 0, height);
//...
ctx.Q();
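To see why this matrix undoes a vertical mirror, here is a minimal sketch of the arithmetic behind `cm(1, 0, 0, -1, 0, height)`. The helper `applyMatrix` and the value of `height` are hypothetical and not part of HummusJS. They only illustrate how a PDF matrix `[a b c d e f]` maps a point.

```javascript
// Apply a PDF transformation matrix [a b c d e f] to a point [x, y]:
//   x' = a*x + c*y + e
//   y' = b*x + d*y + f
function applyMatrix([a, b, c, d, e, f], [x, y]) {
  return [a * x + c * y + e, b * x + d * y + f];
}

// Assumed page height in points (hypothetical; use the real page height).
const height = 842;

// The matrix from the snippet above: negate y, then shift up by the page height.
const flip = [1, 0, 0, -1, 0, height];

const fromBottom = applyMatrix(flip, [100, 0]);      // bottom edge maps to the top edge
const fromTop = applyMatrix(flip, [100, height]);    // top edge maps to the bottom edge
const roundTrip = applyMatrix(flip, applyMatrix(flip, [100, 200])); // flipping twice restores the point
```

Because the flip is its own inverse, applying it on top of an already flipped coordinate system brings drawing back to the usual orientation.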

It looks like some input PDFs have some internal transformation that is causing this mirroring. So I need to inspect the PDFs for that. I use parsePage() to get information about the original PDF, but there is no transformation info in it.
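One plausible cause, stated here as an assumption rather than something confirmed in this thread, is that the page's existing content stream applies a flip with `cm` and never restores it with `Q`. Anything appended afterwards would then inherit that flip. The sketch below is plain JavaScript with hypothetical helper names (`concat`, `trailingMatrix`). It uses a deliberately naive whitespace tokenizer that ignores strings, inline images, and comments, so it shows the idea and is not a real content-stream parser.

```javascript
// Multiply two PDF matrices [a b c d e f]; the result applies m1 first, then m2.
function concat(m1, m2) {
  const [a1, b1, c1, d1, e1, f1] = m1;
  const [a2, b2, c2, d2, e2, f2] = m2;
  return [
    a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
    c1 * a2 + d1 * c2, c1 * b2 + d1 * d2,
    e1 * a2 + f1 * c2 + e2, e1 * b2 + f1 * d2 + f2,
  ];
}

// Return the transformation matrix still in effect after the last operator
// of an already decoded content stream (hypothetical input: plain text).
function trailingMatrix(content) {
  let ctm = [1, 0, 0, 1, 0, 0]; // identity
  const saved = [];             // graphics state stack for q / Q
  const operands = [];          // numeric operands seen since the last operator
  for (const token of content.split(/\s+/).filter(Boolean)) {
    if (token === 'q') {
      saved.push(ctm);
      operands.length = 0;
    } else if (token === 'Q') {
      if (saved.length > 0) ctm = saved.pop();
      operands.length = 0;
    } else if (token === 'cm') {
      // cm premultiplies the current matrix by its six operands.
      if (operands.length >= 6) ctm = concat(operands.slice(-6), ctm);
      operands.length = 0;
    } else if (!Number.isNaN(Number(token))) {
      operands.push(Number(token));
    } else {
      operands.length = 0; // any other operator consumes its operands
    }
  }
  return ctm;
}

// A flip that is never restored leaks into whatever is drawn next.
const leaked = trailingMatrix('1 0 0 -1 0 842 cm 0 0 100 100 re f');
// The same content wrapped in q ... Q leaves the identity matrix behind.
const isolated = trailingMatrix('q 1 0 0 -1 0 842 cm 0 0 100 100 re f Q');
```

If the trailing matrix is not the identity, content added after it is drawn in the transformed coordinate system.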

Is there a way to inspect a PDF to see if it has some internal transformations?

Thanks!

galkahana commented 4 years ago

Pass true as the third parameter to the PDFPageModifier constructor to cancel any existing transformation. I reckon this would be a shorter route than attempting to parse the transformation (though, yeah, it is possible). See the text parsing sample for parsing page content: https://github.com/galkahana/HummusJSSamples/tree/master/text-extraction
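Applied to the code from the question, the suggestion might look like the sketch below. The wrapper `addImageIsolated` and its parameter names are hypothetical. The `hummus` module is passed in as an argument only so the sketch makes no assumption about how it is installed or loaded. The one substantive change from the original snippet is the third constructor argument.

```javascript
// Hypothetical wrapper around the snippet from the question.
// `hummus` is the loaded HummusJS module, e.g. require('hummus').
function addImageIsolated(hummus, inPdfPath, outPdfPath, imagePath, x, y, w) {
  const pdfModifyWriter = hummus.createWriterToModify(inPdfPath, {
    modifiedFilePath: outPdfPath,
  });

  // Third argument `true`: per the reply above, this cancels any
  // transformation left over from the page's existing content.
  const pageModifier = new hummus.PDFPageModifier(pdfModifyWriter, 0, true);

  const ctx = pageModifier.startContext().getContext();
  ctx.drawImage(x, y, imagePath, {
    transformation: { width: w, height: w, proportional: true },
  });

  pageModifier.endContext().writePage();
  pdfModifyWriter.end();
}
```

With this flag set, the manual `q` / `cm` / `Q` workaround from the question should no longer be needed.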

bebu259 commented 4 years ago

That works great, thank you.