brunoais closed 3 years ago
Can you send an example image causing the crash? You can enable the ocr_dump_debug_image option as described in the manual.
I think it's crashing no matter what image I try.
Now that I went to check, it seems the tesseract-ocr package was updated to 4.1.1, and the manual states in the information for Windows that anything besides tesseract 4.0 will make it crash.
Can that be the cause of the crash, tesseract being updated to 4.1.X?
I just confirmed it was the automatic update to 4.1.X. Is there a way to get an updated version that works with 4.1.X? Maybe custom compiling it?
How did you get 4.1.1 on Ubuntu 18.04? The official Ubuntu 18.04 ppa provides tesseract 4.0.0: https://packages.ubuntu.com/search?keywords=tesseract&searchon=names&suite=all&section=all
I tried both tesseract versions (4.00 and 4.1.1) on Xubuntu 18.04 and 20.04 respectively; the program worked without issues. I'll also try on Ubuntu soon.
Is there a chance you are using Wayland?
However, no, that's definitely not Wayland, otherwise even the hotkey would not work and you would not be able to capture an image.
You mentioned an update to 4.1.1. Does this mean that the program worked fine before the update?
It did. And it works again now that I have downgraded back to 4.0.0. I'm using ZorinOS, which is mostly Ubuntu with lots of PPAs built in (most are not installed, just available without me having to add them myself; dockbarx is installed, for example) and many customizations in looks.
It may be that the dpscreenocr ppa only provides a version (on my build) for 4.0.0 but, for ubuntu 20.04, provides a build for 4.1.1.
Which ppa does version 4.1.1 come from? You can use apt-cache policy tesseract-ocr to see that.
ppa:alex-p/tesseract-ocr
Yes, it turns out that dpScreenOCR compiled with libtesseract 4.0 doesn't work with 4.1.1 and vice versa.
The problem is in ABI incompatibility between the versions, namely the ETEXT_DESC class used to monitor OCR progress: https://github.com/tesseract-ocr/tesseract/blob/b19e3ee63c4afe207676e3e1b3211f52909f8d48/include/tesseract/ocrclass.h#L99 Tesseract 4.1.1 expects ETEXT_DESC from 4.1.1, but gets the one from 4.0 and crashes.
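To make the ABI point concrete, here is a minimal sketch of how adding one member to a struct shifts the offsets of the members after it. The two layouts below are hypothetical and made up for illustration (including the field names); they are not the real ETEXT_DESC definitions from either Tesseract release. Python's ctypes is used only because it lays out fields the way a C compiler would.

```python
import ctypes


# Hypothetical "old" layout. NOT the real ETEXT_DESC, just an illustration.
class ProgressOld(ctypes.Structure):
    _fields_ = [
        ("count", ctypes.c_int16),
        ("progress", ctypes.c_int16),
        ("cancel_this", ctypes.c_void_p),
    ]


# Hypothetical "new" layout: one extra pointer-sized member was inserted
# before cancel_this, so everything after it moves.
class ProgressNew(ctypes.Structure):
    _fields_ = [
        ("count", ctypes.c_int16),
        ("progress", ctypes.c_int16),
        ("extra_callback", ctypes.c_void_p),  # assumed new member
        ("cancel_this", ctypes.c_void_p),
    ]


def field_offsets(struct_type):
    """Return a {field name: byte offset} mapping for a ctypes structure."""
    return {
        name: getattr(struct_type, name).offset
        for name, *_ in struct_type._fields_
    }


if __name__ == "__main__":
    print("old:", field_offsets(ProgressOld), ctypes.sizeof(ProgressOld))
    print("new:", field_offsets(ProgressNew), ctypes.sizeof(ProgressNew))
```

A program compiled against the old layout writes cancel_this at the offset where a library built with the new layout looks for extra_callback. The library then reads a pointer the caller never set up, which is the kind of mismatch that can end in a crash.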
One possible solution would be to use the Tesseract C API, but tesseract from Ubuntu 18.04 (version 4.00~git2288-10f4998a-2) is older than the 4.0.0 release and doesn't include the C API for progress monitoring: https://github.com/tesseract-ocr/tesseract/commit/87d33b6c9ed87699ff0d6588279d8268f5df6db3
So, unfortunately, this problem is impossible to fix. You should either use the default tesseract version or compile dpScreenOCR manually so it works with 4.1.1.
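If you want to check quickly whether an installed package is in the same release series as the one a binary was built against, a rough sketch could look like the following. The helper names are hypothetical, and the rule "same major.minor means compatible" is only an assumption based on this thread. A matching series is not a guarantee either, since the Ubuntu 18.04 package is a pre-release snapshot that differs from 4.0.0.

```python
import re
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings like '4.1.1' or '4.00~git2288-10f4998a-2'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def same_release_series(built_against: str, installed: str) -> bool:
    """Heuristic only: treat versions as matching when major.minor agree."""
    return major_minor(built_against) == major_minor(installed)


if __name__ == "__main__":
    # The combination reported in this thread: built for 4.0, running 4.1.1.
    print(same_release_series("4.0.0", "4.1.1"))
```

The installed version string can be taken from the output of apt-cache policy tesseract-ocr.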
OK. Understood. Thank you.
When I try to translate some text on the screen, dpscreenocr is crashing with
I don't know if the warning is meaningful, btw. Any idea how to diagnose this better? I'm running Ubuntu 18.04 and dpscreenocr 1.0.2 from the ppa.