nisaacson / pdf-extract

Node PDF Extract
MIT License
383 stars 76 forks

Add options to use with pdftotext: encoding and mode #32

Closed nsacerdote closed 5 years ago

nsacerdote commented 5 years ago

I was having some issues with special characters like "ü". After some research, the problem seems to be solved by specifying the encoding for pdftotext. Since there was no way to do that (AFAICS), I prepared this pull request.

Edit: I was also having issues with tables; setting the mode to -table solved my problem, so I added an option to specify the mode as well.

See https://www.xpdfreader.com/pdftotext-man.html
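For illustration, here is a minimal sketch of how an encoding and a mode could be turned into pdftotext command-line arguments. It is not the code from this pull request: the helper name `buildPdftotextArgs` and the option names `encoding` and `mode` are assumptions made for the example. The flags themselves (`-enc`, `-table`, `-layout`, and so on) are the ones listed in the xpdf pdftotext man page linked above.

```javascript
// Hypothetical helper, not the library's actual API.
// The option names `encoding` and `mode` are assumptions for this sketch.

// Output modes documented in the xpdf pdftotext man page.
const KNOWN_MODES = ['layout', 'simple', 'table', 'lineprinter', 'raw'];

function buildPdftotextArgs(pdfPath, options = {}) {
  const args = [];

  // -enc <name> selects the output text encoding, e.g. "UTF-8".
  if (options.encoding) {
    args.push('-enc', options.encoding);
  }

  // Each mode is a bare flag such as -table or -layout.
  if (options.mode) {
    if (!KNOWN_MODES.includes(options.mode)) {
      throw new Error(`Unsupported pdftotext mode: ${options.mode}`);
    }
    args.push(`-${options.mode}`);
  }

  // Passing "-" as the text file makes pdftotext write to stdout.
  args.push(pdfPath, '-');
  return args;
}
```

The resulting array could then be handed to something like `child_process.spawn('pdftotext', args)`. Validating the mode against a fixed list keeps arbitrary strings from being passed through as flags.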

nisaacson commented 5 years ago

thanks for the PR!

nsacerdote commented 5 years ago

You're welcome! Thank you for building this library!