tcr / scissors

PDF manipulation in Node.js! Split, join, crop, read, extract, boil, mash, stick them in a stew.
Apache License 2.0

Extract content #1

Open tcr opened 11 years ago

tcr commented 11 years ago

Using ps2ascii, content can be extracted in the form of text, images, and fills. This information probably needs to be aligned onto a grid with a certain fineness, so that elements at nearly the same position end up in the same grid cell:

Spindrift
.extract([grid fineness]) // grid fineness creates 
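
To illustrate what a grid fineness could mean here, a minimal sketch: it assumes each extracted item has already been reduced to an object with x/y coordinates. That shape, and the names snap and alignToGrid, are hypothetical; they are not part of this library or of ps2ascii output.

```javascript
// Hypothetical helper: snap a coordinate to the nearest multiple of `fineness`.
function snap(value, fineness) {
  return Math.round(value / fineness) * fineness;
}

// Hypothetical command shape: { type, x, y, value }.
// Returns new objects; the input commands are left untouched.
function alignToGrid(commands, fineness) {
  return commands.map((cmd) => ({
    ...cmd,
    x: snap(cmd.x, fineness),
    y: snap(cmd.y, fineness),
  }));
}

// Two text fragments that sit at almost the same position.
const aligned = alignToGrid(
  [
    { type: 'text', x: 71.8, y: 700.3, value: 'Hello' },
    { type: 'text', x: 72.1, y: 699.6, value: 'world' },
  ],
  5
);
```

With a fineness of 5, both fragments land on the same grid point (70, 700), which is what would let later grouping treat them as belonging to one row. A coarser grid merges more aggressively; a finer one keeps more of the original layout.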

Group
.commands() => [<command>, <command>]
.bound(l, b, r, t) => group()
.rows() => [<group>, <group>] // tries to automatically determine "rows" of elements
.columns() => [<group>, <group>] // tries to automatically determine "columns" of elements
.text() // plaintext
.images() // images
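
A rough sketch of how the Group interface above might behave. Everything here is an assumption for discussion: the command shape ({ type, x, y, value }), the idea that coordinates are already grid-aligned, and the choice to treat "same y" as a row and "same x" as a column. It is not an existing implementation.

```javascript
// Hypothetical command shape: { type: 'text' | 'image' | 'fill', x, y, value }.
class Group {
  constructor(commands) {
    this._commands = commands;
  }

  // Return a copy so callers cannot mutate the group's internal list.
  commands() {
    return this._commands.slice();
  }

  // Keep commands whose origin lies inside the box (left, bottom, right, top).
  bound(l, b, r, t) {
    return new Group(
      this._commands.filter((c) => c.x >= l && c.x <= r && c.y >= b && c.y <= t)
    );
  }

  // Bucket commands by one axis; assumes coordinates are already grid-aligned.
  _bucket(axis, descending) {
    const buckets = new Map();
    for (const c of this._commands) {
      if (!buckets.has(c[axis])) buckets.set(c[axis], []);
      buckets.get(c[axis]).push(c);
    }
    const keys = [...buckets.keys()].sort((a, b) => (descending ? b - a : a - b));
    return keys.map((k) => new Group(buckets.get(k)));
  }

  // Rows share a y value; assuming y grows upward, sort from top to bottom.
  rows() {
    return this._bucket('y', true);
  }

  // Columns share an x value, sorted left to right.
  columns() {
    return this._bucket('x', false);
  }

  // Plaintext: one line per row that contains text, fragments ordered by x.
  text() {
    return this.rows()
      .map((row) =>
        row
          .commands()
          .filter((c) => c.type === 'text')
          .sort((a, b) => a.x - b.x)
          .map((c) => c.value)
          .join(' ')
      )
      .filter((line) => line.length > 0)
      .join('\n');
  }

  images() {
    return this._commands.filter((c) => c.type === 'image');
  }
}

const page = new Group([
  { type: 'text', x: 150, y: 700, value: 'world' },
  { type: 'text', x: 70, y: 700, value: 'Hello' },
  { type: 'image', x: 70, y: 500, value: 'logo.png' },
  { type: 'text', x: 70, y: 650, value: 'Second line' },
]);
```

Since bound, rows, and columns all return Groups, calls can be chained, e.g. page.bound(0, 600, 100, 800).text() to read only the text in one region. Exact-match bucketing only works if the grid alignment has already happened; without it, rows() would need some tolerance when comparing y values.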