Using ocrmypdf by default whenever it is found usually makes sense. However, when running remarks on a large collection of PDFs, it can take quite a while.
This PR adds the ability for the user to specify whether they want to use the OCR functionality. It is set to false by default for quicker runs.
OCR is turned on with --use-ocr/-ocr
An error is also thrown if the flag is set but the tool is not found.
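
For reviewers, here is a minimal sketch of the intended behaviour, not the actual diff. Only the flag names (--use-ocr/-ocr) and the false default come from this PR; the function names `build_parser` and `should_run_ocr`, the use of `shutil.which` to locate ocrmypdf, and the choice of `RuntimeError` are assumptions made for illustration.

```python
import argparse
import shutil


def build_parser():
    # Hypothetical parser: only the flag names are taken from this PR.
    parser = argparse.ArgumentParser(prog="remarks")
    parser.add_argument(
        "--use-ocr",
        "-ocr",
        dest="use_ocr",
        action="store_true",  # store_true defaults to False, so OCR is opt-in
        help="run ocrmypdf on the output PDFs (slower)",
    )
    return parser


def should_run_ocr(use_ocr, which=shutil.which):
    """Return True if OCR should run; raise if it was requested but is unavailable."""
    if not use_ocr:
        return False
    if which("ocrmypdf") is None:
        # Assumed error type; the PR only says that an error is thrown.
        raise RuntimeError("--use-ocr was set, but ocrmypdf was not found on PATH")
    return True
```

Passing `which` as a parameter is only there so the lookup can be stubbed out in tests; by default it checks the real PATH.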
@lucasrla for review
As an aside, I've tried to make this a bit faster by parallelizing calls to ocrmypdf, but I ended up running into some segfaults. I'll revisit that at some point, but in any case, the sane default should be set to false to encourage quicker runs (at the expense of some accuracy).