pedro-lucas / node-pdfbox

Node bridge module to library PDFBox
MIT License

How to do "ExtractImages"? #5

Open androidkencai opened 7 years ago

androidkencai commented 7 years ago

We can extract the images from a PDF file by running the PDFBox command-line tool:

java -jar pdfbox-app-2.0.0.jar ExtractImages pdfWithImages.pdf

I tried let image = page.getImageSync(); but it converted the entire PDF page to an image instead of extracting the images inside it. How can I do the equivalent of ExtractImages?

Thanks! KC

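vaibhavbhuva commented 5 years ago

// Assumption: node-pdfbox exports Document under this name.
const { Document } = require('node-pdfbox');

const scale = 1; // assumed render scale; adjust as needed
const document = Document.loadSync(`PDF_PATH`);
const numberOfPages = document.pagesCountSync();
for (let pageNumber = 0; pageNumber < numberOfPages; pageNumber++) {
  const page = document.getPageSync(pageNumber);
  // Renders the whole page as one image; it does not extract embedded images.
  const image = page.getImageSync(scale);
  // Use a distinct DESTINATION_PATH per page, or each save overwrites the last.
  image.saveSync(`DESTINATION_PATH`, 'jpg');
}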
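
If the bridge turns out not to expose an equivalent of ExtractImages, one possible workaround is to call the PDFBox command-line tool from Node directly, using the same command quoted in the question. This is only a rough sketch: it assumes java is on the PATH and that the PDFBox app jar is available locally. The helper names extractImagesArgs and extractImages are hypothetical and are not part of node-pdfbox.

```javascript
const { execFileSync } = require('node:child_process');

// Build the argument list for:
//   java -jar <jarPath> ExtractImages <pdfPath>
// (hypothetical helper, not part of node-pdfbox)
function extractImagesArgs(jarPath, pdfPath) {
  return ['-jar', jarPath, 'ExtractImages', pdfPath];
}

// Run the PDFBox command-line tool synchronously and return its stdout.
// Assumes `java` is on the PATH; throws if the process exits with an error.
function extractImages(jarPath, pdfPath) {
  return execFileSync('java', extractImagesArgs(jarPath, pdfPath), {
    encoding: 'utf8',
  });
}

// Example call (not executed here):
//   extractImages('pdfbox-app-2.0.0.jar', 'pdfWithImages.pdf');
```

Passing the arguments as an array to execFileSync avoids shell quoting problems when the PDF path contains spaces.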