documentcloud / docsplit

Break Apart Documents into Images, Text, Pages and PDFs
http://documentcloud.github.com/docsplit/

added functionality to pass pdftotext options #114

Open narutosanjiv opened 10 years ago

nofxx commented 8 years ago

+1! Please merge this. Without this patch it's impossible to parse a table or any other kind of horizontal text layout.

http://stackoverflow.com/questions/29924528/docsplit-gem-pdf-to-text

For keywords: parse pdf table ruby, parse pdf horizontal text
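
For anyone landing here before this is merged, below is a rough sketch of a workaround that calls `pdftotext` directly with its `-layout` flag, which keeps the physical layout of the page so table columns stay aligned. The helper names `pdftotext_command` and `extract_text_with_options` are made up for illustration; they are not part of docsplit and not necessarily how the patch does it. It assumes `pdftotext` is installed and on your PATH.

```ruby
require 'open3'

# Hypothetical helper (not docsplit API): builds the argument list for
# pdftotext, placing any extra options before the input and output paths.
def pdftotext_command(pdf_path, txt_path, extra_options = [])
  ['pdftotext', *extra_options, pdf_path, txt_path]
end

# Hypothetical helper (not docsplit API): runs pdftotext and returns the
# path of the text file it wrote. Raises if the command exits with an error.
def extract_text_with_options(pdf_path, txt_path, extra_options = [])
  command = pdftotext_command(pdf_path, txt_path, extra_options)
  _stdout, stderr, status = Open3.capture3(*command)
  raise "pdftotext failed: #{stderr}" unless status.success?
  txt_path
end

# Example call (file names are placeholders):
#   extract_text_with_options('report.pdf', 'report.txt', ['-layout'])
```

Passing the arguments as an array rather than one interpolated string avoids shell quoting problems with file names that contain spaces.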