manisandro / gImageReader

A Gtk/Qt front-end to tesseract-ocr.
GNU General Public License v3.0
1.57k stars, 187 forks

Not sure how to proceed with the workflow? #665

Closed jacob-44 closed 5 months ago

jacob-44 commented 5 months ago

Could someone please explain, step by step, how to OCR 50 images from a folder into a single PDF document as a batch job? I am trying to use the UI (various buttons), but I am not getting any results.

jacob-44 commented 5 months ago

When I use the "HOCR Batch Export" feature, it looks like it contains ONLY settings: it has an 'Apply' button, but nothing happens and no file is exported.

I'd expect something like this:

Select files -> Set up export (settings) -> Run batch job -> Success

manisandro commented 5 months ago

Proceed as follows:

  1. Open all the input files you wish to recognize in the sources pane.
  2. Select all the files you want to recognize.
  3. Select OCR mode "hOCR".
  4. In the recognition menu, select batch mode. This will save an hOCR HTML file next to each image.
  5. Open the batch export dialog from the main toolbar.
  6. Select the folder containing your source files and the HTML files which were created in step 4.
  7. Tweak the export settings as desired, and hit Apply to proceed with the export.
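
Before moving from step 4 to step 5, it can help to confirm that batch recognition really did write an HTML file next to every image. The following is only a minimal sketch of such a check, not part of gImageReader. It assumes that each hOCR file has the same base name as its image and ends in `.html`, which the steps above do not spell out, and the function name `find_missing_hocr` and the list of image extensions are made up for illustration.

```python
from pathlib import Path

# Assumption: these are the image types in the folder; adjust as needed.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def find_missing_hocr(folder, hocr_ext=".html"):
    """Return the names of images in `folder` that have no sibling hOCR file.

    Assumption: the hOCR file shares the image's base name and uses
    `hocr_ext` as its extension (e.g. scan01.png -> scan01.html).
    """
    folder = Path(folder)
    missing = []
    for path in sorted(folder.iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_EXTS:
            # Look for a file with the same stem but the hOCR extension.
            if not path.with_suffix(hocr_ext).exists():
                missing.append(path.name)
    return missing


if __name__ == "__main__":
    import sys

    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for name in find_missing_hocr(target):
        print("no hOCR output found for:", name)
```

If the script prints nothing, every image in the folder has a matching HTML file under the naming assumption above. If the actual file names differ, the check needs to be adapted accordingly.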

Feel free to reopen if you have any additional questions.

jacob-44 commented 5 months ago

Hi @manisandro, thank you for the instructions, very helpful. However, after reaching step 6 I hit a problem: a pop-up window says "Unable to open files" and shows a list of the HTML files. After that, the app shuts down.

jacob-44 commented 4 months ago

Not able to reopen...