ocrmypdf / OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
http://ocrmypdf.readthedocs.io/
Mozilla Public License 2.0

Make ruffus pipeline re-entrant #29

Closed by jbarlow83 7 years ago

jbarlow83 commented 8 years ago

In its current form the pipeline is not re-entrant: it is assembled from the command line arguments before main() runs and cannot be changed after that. As a result, there is no value in "import ocrmypdf".

Also, every test case has to run in a subprocess, which makes test failures harder to inspect.

A re-entrant pipeline would make it possible to customize the pipeline if ocrmypdf were used as a library.
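As a rough illustration of what re-entrancy could look like, here is a minimal sketch that assembles the pipeline inside a function instead of at import time. It does not use ruffus or any real ocrmypdf code: `Pipeline`, `build_pipeline`, `run_ocr`, and the step names are all hypothetical stand-ins, and the "document" is just a string that records which steps ran.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List

# A step takes a document (here just a string) and returns the processed document.
Step = Callable[[str], str]


@dataclass
class Pipeline:
    """Hypothetical stand-in for a task pipeline; this is NOT the ruffus API."""

    steps: List[Step] = field(default_factory=list)

    def add(self, step: Step) -> "Pipeline":
        self.steps.append(step)
        return self

    def run(self, document: str) -> str:
        for step in self.steps:
            document = step(document)
        return document


def build_pipeline(deskew: bool = False, extra_steps: Iterable[Step] = ()) -> Pipeline:
    """Assemble a fresh pipeline on every call, so no state lives at module level."""
    pipeline = Pipeline()
    pipeline.add(lambda doc: doc + " > rasterize")  # placeholder step
    if deskew:
        pipeline.add(lambda doc: doc + " > deskew")  # placeholder step
    pipeline.add(lambda doc: doc + " > ocr")  # placeholder step
    for step in extra_steps:  # hook for library users to customize the pipeline
        pipeline.add(step)
    return pipeline


def run_ocr(input_name: str, deskew: bool = False, extra_steps: Iterable[Step] = ()) -> str:
    """Hypothetical library entry point: safe to call repeatedly with different options."""
    return build_pipeline(deskew=deskew, extra_steps=extra_steps).run(input_name)
```

Because nothing is built at import time, a test could call `run_ocr` directly in the test process and inspect a failure with an ordinary traceback, and a library user could pass `extra_steps` to customize the run.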

jbarlow83 commented 7 years ago

The pipeline is re-entrant as of v4.4, but because of ruffus limitations this does not enable the API.