AckerApple / pdfbox-cli-wrap

A wrapper for making PDFBox CLI commands
MIT License



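This package provides the following PDF functionality: reading and filling Acroform fields, embedding timestamp signatures, encrypting and decrypting by password or by certificate, rendering pages to images, and adding images to a PDF.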


Purpose

Connect to Java and the PDFBox library so that Node code can perform well-proven PDF management techniques.

If you've looked into documentation for secure storage of PDFs, you know you need certificate-based security for your PDFs. Next to no public Node libraries offer certificate-based PDF encryption.

Examples

PDF to One Image

Create one image for the first page of a PDF document. Use pdfToImages to make images of other pages.

const path = require('path')
const pdfboxCliWrap = require('pdfbox-cli-wrap')
const readablePdf = path.join(__dirname,'readable.pdf')

pdfboxCliWrap.pdfToImage(readablePdf)
.then(imgPath=>{
  console.log('jpg image created at: '+imgPath)
})
.catch(e=>console.error(e))

PDF To Images

const path = require('path')
const pdfboxCliWrap = require('pdfbox-cli-wrap')
const readablePdf = path.join(__dirname,'readable.pdf')

pdfboxCliWrap.pdfToImages(readablePdf)
.then(pathArray=>{
  console.log('pdf rendered to files here:', pathArray)
})
.catch(e=>console.error(e))

Add Images

Insert one image at one specific location, or append multiple images, and more...

Example Insert Image File into Page

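const path = require('path')
const pdfboxCliWrap = require('pdfbox-cli-wrap')
const readablePdf = path.join(__dirname,'readable.pdf')
const imgPath0 = path.join(__dirname,'image0.jpg')//placeholder path to an image file
const options = {x:200, y:200, page:0, width:100, height:100}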

pdfboxCliWrap.addImages(readablePdf, imgPath0, options)
.then(()=>{
  console.log("Image Inserted")
})
.catch(e=>console.error(e))

Example Insert Image Base64 into Page

const path = require('path')
const pdfboxCliWrap = require('pdfbox-cli-wrap')
const readablePdf = path.join(__dirname,'readable.pdf')
const options = {x:200, y:200, page:0, width:100, height:100}

pdfboxCliWrap.addImages(readablePdf, 'data:image/png;base64,...', options)
.then(()=>{
  console.log("Base64 Image Inserted")
})
.catch(e=>console.error(e))
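
The base64 string in the example above can be built from image bytes read off disk. Here is a minimal sketch, assuming a PNG source. The helper name toDataUri and its default MIME type are illustrative and are not part of pdfbox-cli-wrap.

```javascript
// Hypothetical helper (not part of pdfbox-cli-wrap):
// wraps raw image bytes in a data URI string
function toDataUri(imageBuffer, mimeType = 'image/png') {
  return 'data:' + mimeType + ';base64,' + imageBuffer.toString('base64')
}

// Usage sketch (the file name is a placeholder):
// const uri = toDataUri(require('fs').readFileSync('signature.png'))
// pdfboxCliWrap.addImages(readablePdf, uri, options)
```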

Example Append Images as Pages

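const path = require('path')
const pdfboxCliWrap = require('pdfbox-cli-wrap')
const readablePdf = path.join(__dirname,'readable.pdf')
const imgPath0 = path.join(__dirname,'image0.jpg')//placeholder path to an image file
const imgPath1 = path.join(__dirname,'image1.jpg')//placeholder path to an image file
const options = {
  y:-1,//very top of page
  page:-1,//a new page will be created for image insert
  width:'100%'//the ratio of page width to image width will be calculated into a constrained size
}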

pdfboxCliWrap.addImages(readablePdf,[imgPath0, imgPath1], options)
.then(()=>{
  console.log("Images Added as Pages to Original PDF File")
})
.catch(e=>console.error(e))

Read Acroform

Read PDF form fields as an array of objects

const path = require('path')
const pdfboxCliWrap = require('pdfbox-cli-wrap')
const readablePdf = path.join(__dirname,'readable.pdf')

pdfboxCliWrap.getFormFields(readablePdf)
.then(fields=>{
  console.log(fields)
})
.catch(e=>console.error(e))

The result of getFormFields will look like the following JSON

[{
  "fullyQualifiedName": "form1[0].#subform[6].FamilyName[0]",
  "isReadOnly": false,
  "partialName": "FamilyName[0]",
  "type": "org.apache.pdfbox.pdmodel.interactive.form.PDTextField",
  "isRequired": false,
  "page": 6,
  "cords": {
    "x": "39.484",
    "y": "597.929",
    "width": "174.00198",
    "height": "15.119995"
  },
  "value": "Apple"
}]
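
Note that the cords values in the result above are strings. Here is a small sketch, assuming that result shape, which converts them to numbers and keeps only the fields on one page. The helper name fieldsOnPage is hypothetical.

```javascript
// Hypothetical helper (not part of pdfbox-cli-wrap):
// returns copies of the fields on the given page, with numeric coordinates
function fieldsOnPage(fields, page) {
  return fields
    .filter(field => field.page === page)
    .map(field => Object.assign({}, field, {
      cords: {
        x: parseFloat(field.cords.x),
        y: parseFloat(field.cords.y),
        width: parseFloat(field.cords.width),
        height: parseFloat(field.cords.height)
      }
    }))
}

// Usage sketch:
// pdfboxCliWrap.getFormFields(readablePdf)
// .then(fields=>console.log(fieldsOnPage(fields, 6)))
```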

Fill Acroform

Fill PDF form fields from an array of objects

const path = require('path')
const pdfboxCliWrap = require('pdfbox-cli-wrap')
const readablePdf = path.join(__dirname,'readable.pdf')
const outPdfPath = path.join(__dirname,'filled.pdf')

//array of field values
const data = [{
  "fullyQualifiedName": "Your_FirstName",
  "value": "Acker"
}]

pdfboxCliWrap.embedFormFields(readablePdf, data, outPdfPath, {flatten:true})
.then(()=>{
  console.log("success")
})
.catch(e=>console.error(e))
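
When the values live in a plain key/value object, the array-of-objects shape used above can be built with a few lines. This is a sketch only; toFieldArray is a hypothetical helper name, and the field name is the one from the example above.

```javascript
// Hypothetical helper (not part of pdfbox-cli-wrap):
// converts {fieldName: value} into [{fullyQualifiedName, value}, ...]
function toFieldArray(values) {
  return Object.keys(values).map(name => ({
    fullyQualifiedName: name,
    value: values[name]
  }))
}

const fieldData = toFieldArray({ Your_FirstName: 'Acker' })
// fieldData can then be passed as the second argument of embedFormFields
```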

Advanced Fill Acroform

The data below targets two fields:

const path = require('path')
const pdfboxCliWrap = require('pdfbox-cli-wrap')
const readablePdf = path.join(__dirname,'readable.pdf')
const outPdfPath = path.join(__dirname,'filled.pdf')

//array of field values
const data = [{
  "fullyQualifiedName": "form1[0].#subform[6].FamilyName[0]",
  "isReadOnly": false,
  "partialName": "FamilyName[0]",
  "type": "org.apache.pdfbox.pdmodel.interactive.form.PDTextField",
  "isRequired": false,
  "page": 6,
  "cords": {
    "x": "39.484",
    "y": "597.929",
    "width": "174.00198",
    "height": "15.119995"
  },
  "value": "Apple"
},{
  "fullyQualifiedName": "form1[0].#subform[6].EmployeeSignature[0]",
  "isReadOnly": true,
  "partialName": "EmployeeSignature[0]",
  "type": "org.apache.pdfbox.pdmodel.interactive.form.PDTextField",
  "isRequired": false,
  "page": 6,
  "cords": {
    "x": "126.964",
    "y": "227.523",
    "width": "283.394",
    "height": "15.12001"
  },
  "remove": true,
  "base64Overlay": {
    "uri": "......................=",
    "forceWidthHeight": true
  }
}]

pdfboxCliWrap.embedFormFields(readablePdf, data, outPdfPath)
.then(()=>{
  console.log("success")
})
.catch(e=>console.error(e))
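
The second entry above can be produced from a field object and an image data URI. The sketch below only copies the remove and base64Overlay keys seen in that example. It makes no claim about how the library interprets them, and withOverlay is a hypothetical helper name.

```javascript
// Hypothetical helper (not part of pdfbox-cli-wrap):
// returns a copy of the field carrying the "remove" and "base64Overlay"
// keys shown in the example above
function withOverlay(field, dataUri) {
  return Object.assign({}, field, {
    remove: true,
    base64Overlay: { uri: dataUri, forceWidthHeight: true }
  })
}

// Usage sketch:
// const signatureField = withOverlay(fields[1], 'data:image/png;base64,...')
```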

Embed Timestamp Signature

const pdfboxCliWrap = require('pdfbox-cli-wrap')

//create paths to pdf files
const path = require('path')
const inPath = path.join(__dirname,'unencrypted.pdf')
const key = path.join(__dirname,'pdfbox-test.p12')

pdfboxCliWrap.signToBuffer(inPath)
.then(buffer=>console.log('signed!'))
.catch(e=>console.error('failed to sign', e))
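
The example above discards the result. Assuming the resolved value is a Node Buffer, as the method name suggests, it can be written to disk as sketched below. The helper name saveSignedPdf and the output file name are placeholders.

```javascript
const fs = require('fs')

// Hypothetical helper (not part of pdfbox-cli-wrap):
// writes the signed PDF bytes to outPath and returns that path
function saveSignedPdf(buffer, outPath) {
  fs.writeFileSync(outPath, buffer)
  return outPath
}

// Usage sketch:
// pdfboxCliWrap.signToBuffer(inPath)
// .then(buffer=>saveSignedPdf(buffer, path.join(__dirname,'signed.pdf')))
```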

Encrypt Decrypt by Password

A great place to start before moving on to certificate based cryptography

const pdfboxCliWrap = require('pdfbox-cli-wrap')

//create paths to pdf files
const path = require('path')
const inPath = path.join(__dirname,'unencrypted.pdf')
const toPath = path.join(__dirname,'encrypted.pdf')
const decryptTo = path.join(__dirname,'unencrypted2.pdf')

//encrypt
let promise = pdfboxCliWrap.encrypt(inPath, toPath, {'password':'123abc'})
.then( ()=>console.log('encryption success!') )

//decrypt
promise.then( ()=>pdfboxCliWrap.decrypt(toPath, decryptTo, {'password':'123abc'}) )
.then( ()=>console.log('decryption success!') )
.catch( e=>console.log(e) )

Encrypt Decrypt by Certificate

This is where the money is

const pdfboxCliWrap = require('pdfbox-cli-wrap')

//create paths to pdf files
const path = require('path')
const readablePdf = path.join(__dirname,'unencrypted.pdf')
const encryptTo = path.join(__dirname,'encrypted.pdf')
const decryptTo = path.join(__dirname,'unencrypted2.pdf')

//create paths to secret files
const cert = path.join(__dirname,'pdfbox-test.crt')
const key = path.join(__dirname,'pdfbox-test.p12')

//encrypt from readable pdf to unreadable pdf
let promise = pdfboxCliWrap.encrypt(readablePdf, encryptTo, {'certFile':cert})
.then( ()=>console.log('encryption success!') )

//how to decrypt
const decOptions = {
  keyStore:key,//Special file that is password protected. It contains both the certificate and the private key.
  password:'password'//unlocks the keyStore file; must match the password the .p12 file was created with
}

promise.then( ()=>pdfboxCliWrap.decrypt(encryptTo, decryptTo, decOptions) )
.then( ()=>console.log('decryption success!') )
.catch( e=>console.log(e) )
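
A missing .crt or .p12 file is an easy mistake to make here. Below is a small pre-flight sketch that checks the paths before calling encrypt or decrypt. The helper name missingFiles is hypothetical.

```javascript
const fs = require('fs')

// Hypothetical helper (not part of pdfbox-cli-wrap):
// returns the subset of paths that do not exist on disk
function missingFiles(paths) {
  return paths.filter(p => !fs.existsSync(p))
}

// Usage sketch:
// const missing = missingFiles([readablePdf, cert, key])
// if (missing.length) throw new Error('Missing files: ' + missing.join(', '))
```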

Learn how to generate .crt and .p12 files in the Generate Certificates and KeyStore section below

Installation

This package is a wrapper for making CLI commands to Java. A few things are going to be needed.

A version of Node.js that supports ECMAScript 6 syntax is required. I believe version 4.0.0 or greater will do. As of this writing I am on Node 7.0.0; my, how time and version numbers can fly.

Install Java

Download and Install Java and be sure the following command works without error in a command terminal:

java -version
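
The same check can be made from Node before calling the wrapper. This is a sketch using only Node's built-in child_process module; the function name isJavaAvailable is illustrative.

```javascript
const { spawnSync } = require('child_process')

// Returns true when `java -version` can be started and exits with status 0.
// When the java command is not found, spawnSync reports it in result.error.
function isJavaAvailable() {
  const result = spawnSync('java', ['-version'])
  return !result.error && result.status === 0
}

// Usage sketch:
// if (!isJavaAvailable()) console.error('Java was not found on the PATH')
```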

Certificate Based Encrypt Decrypt Install Requirements

Are you going to be encrypting and possibly decrypting PDF documents?

This is a 1 step process (maybe 2):

Add BouncyCastle into Java Security Extensions

BouncyCastle is the big daddy of cryptography libraries, for Java and PDFBox

This step is no longer necessary, IN MOST CASES, as BouncyCastle is now bundled with pdfbox-cli-wrap. If you need the BouncyCastle installation documentation, it can be found here

Bouncy Castle is not a registered provider if errors like the following occur: java.io.IOException: Could not find a suitable javax.crypto provider
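
A catch handler can look for that message to give a clearer hint. The sketch below assumes the Java error text is carried through in the rejected error's message, which is not confirmed here. The function name isMissingCryptoProvider is hypothetical.

```javascript
const PROVIDER_ERROR = 'Could not find a suitable javax.crypto provider'

// Hypothetical helper (not part of pdfbox-cli-wrap):
// true when the error text contains the provider message quoted above.
// Assumption: the Java error text reaches err.message unchanged.
function isMissingCryptoProvider(err) {
  if (!err) return false
  return String(err.message || err).includes(PROVIDER_ERROR)
}

// Usage sketch:
// .catch(e=>{
//   if (isMissingCryptoProvider(e)) console.error('BouncyCastle is not registered with Java')
//   else console.error(e)
// })
```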

Generate Certificates and KeyStore

Get ready to run terminal commands against Java's keytool. Fun fun

In a terminal command prompt window, run the following in a folder where certificate files can live

Step #1 Create keyStore

keytool -genkey -keyalg RSA -alias pdfbox-test-alias -keystore pdfbox-test-keystore.jks -storepass pdfbox-test-password -validity 360 -keysize 2048

creates file pdfbox-test-keystore.jks

Step #2 Create a self-signed certificate

keytool -export -alias pdfbox-test-alias -file pdfbox-test.crt -keystore pdfbox-test-keystore.jks

creates file pdfbox-test.crt

Step #3 Marry the certificate and keyStore together as a .p12 file

keytool -importkeystore -srckeystore pdfbox-test-keystore.jks -destkeystore pdfbox-test.p12 -srcstoretype JKS -deststoretype PKCS12 -deststorepass pdfbox-test-password -srcalias pdfbox-test-alias -destalias pdfbox-test-p12

creates file pdfbox-test.p12
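
For scripted setups, the three keytool argument lists above can be assembled in Node. The sketch below only builds the arguments, with the same flags and file naming as the steps above. The function name keytoolSteps is hypothetical, and running the commands is left commented out.

```javascript
// Hypothetical helper: builds the argument lists for the three keytool steps
function keytoolSteps(name, password) {
  const jks = name + '-keystore.jks'
  const alias = name + '-alias'
  return [
    // Step #1 create keyStore
    ['-genkey', '-keyalg', 'RSA', '-alias', alias, '-keystore', jks,
      '-storepass', password, '-validity', '360', '-keysize', '2048'],
    // Step #2 export the self-signed certificate
    ['-export', '-alias', alias, '-file', name + '.crt', '-keystore', jks],
    // Step #3 combine into a .p12 file
    ['-importkeystore', '-srckeystore', jks, '-destkeystore', name + '.p12',
      '-srcstoretype', 'JKS', '-deststoretype', 'PKCS12',
      '-deststorepass', password, '-srcalias', alias, '-destalias', name + '-p12']
  ]
}

// To run them (keytool prompts for any values not supplied):
// const { execFileSync } = require('child_process')
// keytoolSteps('pdfbox-test', 'pdfbox-test-password')
//   .forEach(args => execFileSync('keytool', args, { stdio: 'inherit' }))
```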

You should now have the following files in the targeted folder: pdfbox-test-keystore.jks, pdfbox-test.crt and pdfbox-test.p12

MAY Need Java Cryptography

Depending on your level of advanced encryption needs, you (may) need to install Java Cryptography

Test Installation

In the root folder of pdfbox-cli-wrap, in a terminal command window, you can test your installation

Step 1, Install the Test Dependencies (Mocha)

npm install

The pdfbox-cli-wrap folder should now have a folder named "node_modules" with a folder named "mocha"

Step 2, Run the Test

npm test

Documentation

getFormFields

examples

getFormFieldsAsObject

Read Acroform fields from a PDF as object-of-objects where each key is the fullyQualifiedName of input field

examples
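
To picture the object-of-objects shape described above, here is a sketch that keys an array of fields by fullyQualifiedName. It illustrates the shape only and is not the library's implementation; keyByName is a hypothetical helper name.

```javascript
// Hypothetical helper (not part of pdfbox-cli-wrap):
// turns [{fullyQualifiedName, ...}, ...] into {fullyQualifiedName: {...}, ...}
function keyByName(fields) {
  return fields.reduce((byName, field) => {
    byName[field.fullyQualifiedName] = field
    return byName
  }, {})
}

// Usage sketch:
// const byName = keyByName(fields)
// console.log(byName['form1[0].#subform[6].FamilyName[0]'].value)
```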

embedFormFields

Takes array-of-objects and sets values of PDF Acroform fields

examples

embedFormFieldsByObject

Fill Acroform fields of a PDF from an object-of-objects, where each key is the fullyQualifiedName of an input field, to set the values of input fields

examples

sign

Will embed timestamp signature with optional TSA option

examples

signToBuffer

See sign

pdfboxCliWrap.signToBuffer(path,outPath,options).then(buffer).catch()

signByBuffer

See sign

pdfboxCliWrap.signByBuffer(buffer,options).then(buffer).catch()

encrypt

Will encrypt a PDF document

examples

decrypt

Will decrypt a PDF document

examples

pdfToImages

Will create an image for any or every page in a PDF document.

examples

pdfToImage

Will create one image for the first page of a PDF document. Use pdfToImages to make images of other pages.

examples

addImages

Insert a single image into a PDF or append multiple images as individual pages

examples

Resources

Credits