quickemu-project / quicktest

Quickly and automatically test systems inside Quickemu virtual machines 🧑‍🔬
MIT License

bug: Tesseract command line options only work on 5.x #20

Open popey opened 1 month ago

popey commented 1 month ago

Expected behavior

Quicktest should work on Ubuntu 22.04.

Actual behavior

Tesseract fails with: Error, unknown command line argument '--loglevel'

Workaround:

Unset the TESSERACT_OCR_OPTIONS environment variable, or set it to something else, for example:

TESSERACT_OCR_OPTIONS="--oem 0" ./quicktest

Steps to reproduce the behavior

Run Quicktest on Ubuntu 22.04

Additional context

We should probably either detect the version of tesseract and bail if it is older than 5.x or, preferably, tweak settings depending on the version of tesseract installed. Ideally people should be able to run this on any release.
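A minimal sketch of the "tweak settings depending on the version" idea, not what quicktest actually does. It parses the major version from the first line of the tesseract --version output and drops --loglevel from the option string on anything older than 5. The function names tesseract_major and adjust_ocr_options are hypothetical, and the sketch assumes the option is passed as two separate words (--loglevel VALUE), not as --loglevel=VALUE.

```shell
#!/usr/bin/env bash

# Extract the major version from a `tesseract --version` banner line,
# e.g. "tesseract 4.1.1" or "tesseract v5.0.0-alpha".
# Prints the major number, or returns 1 if the banner is not recognised.
tesseract_major() {
    local banner="$1"
    local re='^tesseract[[:space:]]+v?([0-9]+)\.'
    if [[ $banner =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1
    fi
}

# Remove "--loglevel <value>" from an option string when the major
# version is below 5; on 5 and later the string is returned unchanged
# (apart from whitespace normalisation).
adjust_ocr_options() {
    local major="$1" opts="$2"
    local -a in_words=() out_words=()
    local skip_next=0 word
    read -r -a in_words <<< "$opts"
    for word in "${in_words[@]}"; do
        if (( skip_next )); then
            skip_next=0          # this word was the value of --loglevel
            continue
        fi
        if (( major < 5 )) && [[ $word == "--loglevel" ]]; then
            skip_next=1
            continue
        fi
        out_words+=("$word")
    done
    printf '%s\n' "${out_words[*]}"
}

# Example wiring (assumes tesseract is on PATH):
#   banner="$(tesseract --version 2>&1 | head -n 1)"
#   major="$(tesseract_major "$banner")" || { echo "Cannot detect tesseract version" >&2; exit 1; }
#   TESSERACT_OCR_OPTIONS="$(adjust_ocr_options "$major" "${TESSERACT_OCR_OPTIONS:-}")"
```

Filtering the option string instead of bailing keeps older releases such as 22.04 usable, at the cost of losing the log-level setting there.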

philclifford commented 1 month ago

For me, with the above TESSERACT_OCR_OPTIONS="--oem 0" workaround on 22.04, I then get:

Error: Tesseract (legacy) engine requested, but components are not present in /usr/share/tesseract-ocr/4.00/tessdata/eng.traineddata!!
Failed loading language 'eng'
Tesseract couldn't load any languages!

but

List of available languages (4):
cym
eng
gla
osd

So I resorted to a quick look at --help-extra and tried --oem 3, which got further but failed to PASS (the requested text was on screen but was not found). 🛑

UPDATE

[20240517-150932] 📁 /home/phil/src/quicktest/results/alpine/v3.19/test_boot_to_login/20240517-150833
[20240517-150932] 🎉 Test passed: test_boot_to_login

The solution was to grab the best tessdata eng file.
🎆
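For anyone else hitting this, a rough sketch of one way to fetch the file. It assumes "best" means the tesseract-ocr/tessdata_best repository on GitHub and that its default branch is main; the comment above does not say exactly which file was used or where it was placed. The function names and the destination directory are hypothetical.

```shell
#!/usr/bin/env bash

# Build the download URL for a language model.
# Assumption: the model comes from tesseract-ocr/tessdata_best, branch "main".
tessdata_best_url() {
    local lang="$1"
    printf 'https://github.com/tesseract-ocr/tessdata_best/raw/main/%s.traineddata\n' "$lang"
}

# Download the model into a private directory so the distribution's
# packaged tessdata is left untouched.
fetch_tessdata_best() {
    local lang="$1" dest_dir="$2"
    mkdir -p "$dest_dir" &&
        curl --fail --location --silent --show-error \
            --output "$dest_dir/$lang.traineddata" "$(tessdata_best_url "$lang")"
}

# Example (hypothetical paths):
#   fetch_tessdata_best eng "$HOME/.local/share/tessdata"
#   tesseract screenshot.png stdout --tessdata-dir "$HOME/.local/share/tessdata" -l eng --oem 3
```

Pointing tesseract at a separate directory with --tessdata-dir avoids overwriting /usr/share/tesseract-ocr/4.00/tessdata/eng.traineddata.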