ryanfb closed 9 years ago
Closing due to 82d3177429d4dcf7300aa54398ea98c617024f36.
Thank you so much for this! As you saw, I went ahead and implemented a larger rewrite of the pre-extracted line reading/writing functionality.
The Tesseract code is also great to have, and something we've been talking about. I made some changes to your code, and I'll send you a pull request on your repo, but would you be interested in incorporating it into the main Ocular repo? Additionally, could you put together some instructions explaining the minimum requirements for compiling the C++ file?
Is there more info needed for compiling the C++ file? I believe using `pkg-config` in the Makefile should handle compiling on Linux/Mac if (lib)tesseract/(lib)leptonica are installed via most package managers.
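For anyone who wants to check their setup by hand, here is a rough sketch of what the pkg-config step amounts to. The file names `line_splitter.cpp` and `line_splitter` are placeholders I made up, not the names used in the repo, and I'm assuming the pkg-config modules are called `tesseract` and `lept`, which is what the Tesseract and Leptonica packages I know of install.

```shell
# Sketch only: SRC and OUT are hypothetical names, not taken from the repository.
SRC=line_splitter.cpp
OUT=line_splitter

# Assumed pkg-config module names for Tesseract and Leptonica.
MODULES="tesseract lept"

if ! command -v pkg-config >/dev/null 2>&1; then
  STATUS="pkg-config not found"
elif ! pkg-config --exists $MODULES; then
  STATUS="tesseract/leptonica development files not found"
elif [ ! -f "$SRC" ]; then
  STATUS="source file $SRC not found"
elif c++ -o "$OUT" "$SRC" $(pkg-config --cflags --libs $MODULES); then
  # pkg-config supplied the include paths and linker flags for both libraries.
  STATUS="built $OUT"
else
  STATUS="compile failed"
fi
echo "$STATUS"
```

If the second check fails, installing the development packages for both libraries through your package manager is probably the missing step.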
With this patch, if I specify e.g. `-inputPath images/pugna -existingExtractionsPath pre_ex`, it will look in `pre_ex/line_extract/pugna` for existing extractions. It should also fall back to normal extraction if no extractions are found.

I've also made a simple Tesseract line splitter whose output works with this option.
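To make the lookup rule concrete, here is a small shell sketch of it. This is not the code from the patch: `resolve_extractions` is a hypothetical helper, and it assumes the subdirectory name is the last component of the input path, which is what the `images/pugna` example suggests but which I have only shown for that one case.

```shell
# resolve_extractions INPUT_PATH EXISTING_EXTRACTIONS_PATH
# Hypothetical helper: prints the directory holding pre-extracted lines and
# returns 0, or prints nothing and returns 1 when the caller should fall back
# to normal extraction.
resolve_extractions() {
  input_path=$1
  existing_path=$2
  # Assumption: the lookup key is the last component of the input path.
  candidate="$existing_path/line_extract/$(basename "$input_path")"
  # Use the directory only if it exists and contains at least one entry.
  if [ -d "$candidate" ] && [ -n "$(ls -A "$candidate" 2>/dev/null)" ]; then
    printf '%s\n' "$candidate"
    return 0
  fi
  return 1
}
```

Called as `resolve_extractions images/pugna pre_ex`, it prints `pre_ex/line_extract/pugna` when that directory has content, and returns a non-zero status otherwise.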