Open AvtechScientific opened 3 years ago
Yes, this is one of the basic features necessary for an OCR program. If it gets added, I can donate to support development. Just make a simple GUI for modifying the Tesseract configuration file, with a short description of each parameter shown on hover.
Probably the fastest way to achieve this is for someone to contribute the code via a PR. For my part, I won't have the capacity to work on this in the near future.
I created a simple Python script that extracts the boxes from the HTML file. In gImageReader, export the edited image as HTML, then run the script on that file to extract the boxes: https://github.com/khashashin/chechen_ocr
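For anyone who wants to see the general idea without opening the repository, here is a minimal sketch of that kind of extraction. It is not the linked script. It assumes the exported HTML is hOCR, where each word is a `span` with class `ocrx_word` and a `title` attribute containing `bbox x0 y0 x1 y1`. The names `HocrWordParser` and `extract_word_boxes` are made up for this illustration.

```python
import re
from html.parser import HTMLParser

# Matches the "bbox x0 y0 x1 y1" property inside an hOCR title attribute.
BBOX_RE = re.compile(r"bbox\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)")


class HocrWordParser(HTMLParser):
    """Collects (text, (x0, y0, x1, y1)) for every ocrx_word span."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None   # bbox of the word currently being read, if any
        self._text = []     # text fragments of the current word
        self._depth = 0     # span nesting depth inside the current word

    def handle_starttag(self, tag, attrs):
        if self._bbox is not None:
            # Already inside a word: only track nested spans.
            if tag == "span":
                self._depth += 1
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            match = BBOX_RE.search(attrs.get("title") or "")
            if match:
                self._bbox = tuple(int(v) for v in match.groups())
                self._text = []
                self._depth = 1

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._bbox is not None and tag == "span":
            self._depth -= 1
            if self._depth == 0:
                text = "".join(self._text).strip()
                if text:
                    self.words.append((text, self._bbox))
                self._bbox = None


def extract_word_boxes(hocr_text):
    """Return a list of (word, bbox) tuples from an hOCR string."""
    parser = HocrWordParser()
    parser.feed(hocr_text)
    parser.close()
    return parser.words
```

To use it, read the exported file as text and pass it to `extract_word_boxes`. Note that hOCR coordinates have their origin at the top-left corner, while Tesseract `.box` files use a bottom-left origin, so the y values would still need to be flipped against the page height before writing a box file.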
It would be nice to have GUI elements that assist in fine-tuning/training Tesseract on scanned images, similar to what jTessBoxEditor does, as described in this article[^*]. Mainly creating the .tiff and .box files...
[^*]: Not all the commands listed in the article worked for me. Here are slightly corrected versions: