documentcloud / docsplit

Break Apart Documents into Images, Text, Pages and PDFs
http://documentcloud.github.io/docsplit/

== Docsplit

Docsplit is a command-line utility and Ruby library for splitting apart documents into their component parts: searchable UTF-8 plain text, page images or thumbnails in any format, PDFs, single pages, and document metadata (title, author, number of pages, and so on).

Installation: gem install docsplit
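
Once the gem is installed, the library side can be sketched roughly as follows. This is a minimal illustration, not the project's official usage: the split_document helper is hypothetical, the file paths are placeholders, and the option names (:output, :size, :format) are given as I understand them from the Docsplit documentation, which remains the authoritative reference. Docsplit also relies on external tools for some extractions, so a working setup is assumed.

```ruby
# Sketch only: assumes the docsplit gem and its external dependencies are installed.
begin
  require 'docsplit'
  DOCSPLIT_AVAILABLE = true
rescue LoadError
  DOCSPLIT_AVAILABLE = false
end

# Hypothetical helper (not part of Docsplit): extracts plain text and
# page images from one PDF into output_dir and returns the page count.
# Returns nil if the input file is missing or the gem is not installed.
def split_document(pdf_path, output_dir)
  return nil unless File.exist?(pdf_path)
  return nil unless DOCSPLIT_AVAILABLE

  # Searchable UTF-8 text for the document.
  Docsplit.extract_text(pdf_path, output: output_dir)

  # One PNG per page, 700 pixels wide (assumed option names).
  Docsplit.extract_images(pdf_path, size: '700x', format: :png, output: output_dir)

  # Document metadata: number of pages.
  Docsplit.extract_length(pdf_path)
end
```

A call such as split_document('report.pdf', 'out') would then write the extracted files under out/ and return the page count; the nil return keeps the helper safe to call when the file or the gem is absent.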

For documentation, usage, and examples, see: https://documentcloud.github.io/docsplit/

To suggest a feature or report a bug: http://github.com/documentcloud/docsplit/issues/