Open k00ni opened 10 months ago
I switched from pdftotext to PdfParser specifically so that my search engine (which scans HTML and PDF files) could be an all-PHP solution instead of requiring a binary. But a binary might be useful in other situations.
I think the key argument for/against would be: can PdfParser do a better job than pdftotext? pdftotext is a pretty mature product. https://www.xpdfreader.com/pdftotext-man.html
It can really make things easier in some areas. I have run into situations where I needed something like this a few times.
I am curious if there is a need for a (standalone) executable to get text from a given PDF?
It would be a PHP script still, but can be called in the terminal for shell related tasks. Maybe something like the following?
or
When running this command, the extracted text of /foo/Bar.pdf will be written to pdf_text.txt. But one could also use it to search the text directly via grep etc.

If you need/want something like this, please use the emoticon :+1:, otherwise :-1:. Comments and ideas are welcome.
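To make the idea a bit more concrete, here is a rough sketch of what such a script could look like. It is only an illustration, not an existing part of PdfParser: the file name `pdf2text.php`, the helper names `pdfPathFromArgs` and `extractText`, the exit codes, and the location of `vendor/autoload.php` are all assumptions. It relies only on `Parser::parseFile()` and `Document::getText()` from the library, and assumes PdfParser was installed via Composer.

```php
<?php
// Sketch of a hypothetical CLI wrapper, e.g. saved as pdf2text.php.
// Assumption: PdfParser (smalot/pdfparser) is installed via Composer and
// vendor/autoload.php sits next to this script.

/**
 * Returns the PDF path from the CLI arguments, or null if the call is malformed.
 */
function pdfPathFromArgs(array $argv): ?string
{
    // $argv[0] is the script name; exactly one further argument is expected.
    return count($argv) === 2 && $argv[1] !== '' ? $argv[1] : null;
}

/**
 * Extracts the text of one PDF file using PdfParser.
 */
function extractText(string $pdfPath): string
{
    require_once __DIR__ . '/vendor/autoload.php';

    $parser = new \Smalot\PdfParser\Parser();

    return $parser->parseFile($pdfPath)->getText();
}

// Run only when this file is invoked directly from the terminal,
// not when it is included from another script.
if (PHP_SAPI === 'cli' && isset($argv[0]) && realpath($argv[0]) === __FILE__) {
    $path = pdfPathFromArgs($argv);

    if ($path === null) {
        fwrite(STDERR, "Usage: php {$argv[0]} <file.pdf>\n");
        exit(1);
    }

    try {
        // Text goes to STDOUT so it can be redirected or piped.
        echo extractText($path), PHP_EOL;
    } catch (\Throwable $e) {
        fwrite(STDERR, 'Error: ' . $e->getMessage() . "\n");
        exit(2);
    }
}
```

With a script like this, `php pdf2text.php /foo/Bar.pdf > pdf_text.txt` would write the text to a file, and `php pdf2text.php /foo/Bar.pdf | grep -i "keyword"` would search it directly. Writing to STDOUT and sending errors to STDERR keeps it usable in shell pipelines.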
Thank you for taking the time.