CommonPdf wraps a small subset of command line pdf utilities [ pdftk, PoDoFo ] aiming to provide performant pdf operations in node.js applications. Though not necessary CommonPdf assumes execution in AWS Lambda. Instructions for setup are described below.
example AWS Lambda usage CommonPdf Example api documentation here
All required binaries are provided in the bin directory (binaries are for linux). You may use what is provided or install separately.
Self Install:
Or use CommonPdf.setup (AWS Lambda Only!)
import {setup} from "commonpdf"
setup()
Install CommonPdf
npm i -S commonpdf
npm test
const Rotate = require( 'commonpdf' ).Rotate,
Concat = require( 'commonpdf' ).Concat,
FillForm = require( 'commonpdf' ).FillForm,
Stamp = require( 'commonpdf' ).Stamp
All classes expose a write()
method, and every write returns a promise. The resolved promise will contain
the file path to the newly written file.
File names can either be passed in as an optional parameter or if undefined a unique name will be generated. This is done to avoid name conflicts in AWS Lambda (files written to /tmp may persist across multiple function invocations).
Documentation is currently vague and incomplete. Until this has been remedied (better docs is my current milestone) exploring the .spec.js files in the /src directory will provide working examples that you may follow.
Concat accepts an array of pdf file paths. Concat can also be used to split a document. Given a single Pdf input
define optional parameter Array<{start:number, end:number|string}>
const Concat = require( 'commonpdf' ).Concat,
pdfs = ['fileA.pdf', 'fileB.pdf'],
opts = '/outfile.pdf'
new Concat(pdfs, opts /*optional*/)
.write()
.then(outfilePath => {
// do something
})
FillForm requires FdfGenerator, this class will be briefly covered here, for more info look at fdf-generator.spec
const FillForm = require('commonpdf').FillForm,
FdfGenerator = require('commonpdf').FDFGenerator,
fdfParameters = [
{fieldname:'hierarchical field name', fieldvalue:'a string value'},
{fieldname:'abutton', fieldvalue:true/false}
],
pdfFilePath = '/path/to/target.pdf'
const fdf = new FdfGenerator(pdfFilePath, fdfParameters)
fdf.write()
.then(fdfFile => {
return new FillForm(fdfFile, pdfFilePath).write()
})
.then(outFilePath => {
// do something will filled pdf form
})
const Stamp = require('commonpdf').Stamp,
img = 'data:image/png;base64,.....',
pdf = 'path/to/pdf',
pageNumber = 1,
dimensions = {width:100, height:100, x:100, y:100}
new Stamp(pdf).write(img, pageNumber, dimensions)
.then(outfile => {
// do something with newly stamped pdf
})
const Rotate = require('commonpdf').Rotate,
pdf = 'pdf/file/path',
pageNumber = 2,
config = {direction:'east'}
new Rotate(pdf, pageNumber, config)
.write()
.then(outfile => {
// do something
})
CommonPdf Sign creates a non-visible (no image yet) digital signature. Signing is performed with PoDoFo, Signing requires a cert and key is provided. To create your own self signed x509:
openssl genrsa -out ca.key 4096
openssl req -new -x509 -days 1826 -key ca.key -out ca.crt
Click here for further information on creating your own certificate.
pdftk in.pdf output out.pdf owner_pw PASSWORD-HERE
qpdf --password=PASSWORD-HERE --decrypt in.pdf out.pdf
AWS Lambda supports binary dependencies by allowing them to be included in uploaded ZIP files. However, because Amazon Linux does not support PDFtk or GCJ, PDFtk was built from source in CentOS, a close relative of Amazon Linux. I spun up a CentOS 6 machine in EC2 and followed the instructions on the PDFtk website to build PDFtk from source.
sudo yum install gcc gcc-java libgcj libgcj-devel gcc-c++
wget https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/pdftk-2.02-src.zip
unzip pdftk-2.02-src.zip
cd pdftk-2.02-dist/pdftk
make -f Makefile.Redhat
sudo make -f Makefile.Redhat install
Then I copied the resulting pdftk
binary and /usr/lib64/libgcj.so.10
shared library into the bin/
directory of my Lambda project.
The entry point to the lambda function, index.ts
, alters the PATH
and LD_LIBRARY_PATH
environment variables to let the system know where to find the binary and the GCJ dependency.
It should be possible to use the PDFtk binary and GCJ shared library located in the bin/
directory of this file to run PDFtk in Amazon Linux on EC2. Simply copy them onto the machine and put them in the correct path, or call them directly:
LD_LIBRARY_PATH=/path/to/libgcj.so.10 /path/to/pdftk --version
let exec = require('child_process').exec;
// Set the PATH and LD_LIBRARY_PATH environment variables.
process.env['PATH'] = process.env['PATH'] + ':' + process.env['LAMBDA_TASK_ROOT'] + '/bin';
process.env['LD_LIBRARY_PATH'] = process.env['LAMBDA_TASK_ROOT'] + '/bin';
exports.handler = function handler (event, context) {
exec('pdftk --version', context.done);
};
install deps: sudo yum update sudo yum install openssl-devel libidn-devel libjpeg-turbo-devel libtiff-devel libpng-devel lua-devel freetype-devel fontconfig-devel cppunit-devel wget clang Build cmake from source. The current latest version of cmake on centos(and amazon linux) is 2.8