garysieling / pdf-js-csv

Exploring how to extract tables from a PDF to CSV using PDF.js
http://garysieling.com/blog/extracting-tables-from-pdfs-in-javascript-with-pdf-js

pdf-js-csv

How It Works:

Extracting tables

http://garysieling.com/blog/extracting-tables-from-pdfs-in-javascript-with-pdf-js
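As a rough illustration of the general idea behind position-based table extraction, the sketch below groups positioned text fragments into rows by their vertical coordinate and orders each row left to right before writing CSV. This is not the code from this repository or the linked post. The `{ str, x, y }` item shape, the function names `itemsToRows` and `toCsv`, and the `tolerance` parameter are all assumptions made for the example, and it assumes PDF-style coordinates where a larger `y` is higher on the page.

```javascript
// Hypothetical input: an array of { str, x, y } text fragments, loosely
// modelled on the positioned text a PDF text layer exposes.
// Assumption: larger y means higher on the page (PDF coordinate convention).
function itemsToRows(items, tolerance = 2) {
  // Order fragments top to bottom, then left to right.
  const sorted = [...items].sort((a, b) => b.y - a.y || a.x - b.x);
  const rows = [];
  for (const item of sorted) {
    const last = rows[rows.length - 1];
    // Fragments whose y is within `tolerance` of the row's first fragment
    // are treated as belonging to the same table row.
    if (last && Math.abs(last.y - item.y) <= tolerance) {
      last.cells.push(item);
    } else {
      rows.push({ y: item.y, cells: [item] });
    }
  }
  // Within each row, order cells by x and keep only the text.
  return rows.map(row =>
    row.cells.sort((a, b) => a.x - b.x).map(cell => cell.str)
  );
}

// Serialize rows as CSV, quoting fields that contain commas, quotes, or newlines.
function toCsv(rows) {
  const quote = value =>
    /[",\n\r]/.test(value) ? '"' + value.replace(/"/g, '""') + '"' : value;
  return rows.map(row => row.map(quote).join(',')).join('\n');
}

// Usage with made-up fragments:
const sampleItems = [
  { str: 'b', x: 50, y: 100 },
  { str: 'a', x: 10, y: 100.5 },
  { str: 'c', x: 10, y: 80 },
  { str: 'd, e', x: 50, y: 80 },
];
console.log(toCsv(itemsToRows(sampleItems)));
// a,b
// c,"d, e"
```

A fixed tolerance is the simplest choice here. Real documents with varying font sizes or merged cells would likely need column detection as well, which this sketch does not attempt.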

Loading files in PDF.js using PhantomJS

http://www.garysieling.com/blog/integrating-phantomjs-and-pdf-js-inter-process-communication

Quick Start

npm install pdf2csv --no-bin-links

wget https://github.com/garysieling/pdf-js-csv/raw/master/examples/tests.pdf --no-check-certificate

wget https://raw.github.com/garysieling/pdf-js-csv/master/main.js --no-check-certificate

node main tests.pdf output.csv