Spinoff from the notes in #159, lest I forget: the idea of a user-driven manual edit/correct process for the OCR-ed text. This would improve the text layer of the documents, and thus the search index. It would also improve any exported PDF, provided we incorporate that hOCR text layer into the output PDFs. (Currently we copy the original PDFs to the export directory, as those are the only PDFs we have.)