danpla / dpscreenocr

Program to recognize text on screen
https://danpla.github.io/dpscreenocr/
zlib License

Languages are not displayed after upgrading Ubuntu to 22.10 or later #36

Closed brandones closed 11 months ago

brandones commented 11 months ago

~ $  apt list --installed 'tesseract-ocr-*'                             15:12:52
Listing... Done
tesseract-ocr-eng/lunar,lunar,now 1:4.1.0-2 all [installed]
tesseract-ocr-osd/lunar,lunar,now 1:4.1.0-2 all [installed,automatic]
tesseract-ocr-spa/lunar,lunar,now 1:4.1.0-2 all [installed]

danpla commented 11 months ago

Hi,

Could you please show the output of tesseract --list-langs? Also, did you restart dpScreenOCR after installing the languages?

brandones commented 11 months ago

~ $ tesseract --list-langs                                              18:47:55
List of available languages in "/usr/share/tesseract-ocr/5/tessdata/" (3):
eng
osd
spa

And yes, I did. Checked with

~ $ ps aux | grep ocr                                                   18:48:10
brandon   854893  0.0  0.0  14436  2408 pts/1    S+   18:48   0:00 grep --color=auto ocr

danpla commented 11 months ago

I tried both v1.3.0 and the current development version on Ubuntu 23.04 (Lunar Lobster), and both work without problems. However, I built them manually because they are not available in the PPA, so the question now is: how was dpScreenOCR installed on your machine?

brandones commented 11 months ago

Yeah, I have it installed via apt.

~ $ grep ^ /etc/apt/sources.list /etc/apt/sources.list.d/* | grep dpscreen                                                                     10:50:45
/etc/apt/sources.list.d/daniel_p-ubuntu-dpscreenocr-jammy.list:# deb https://ppa.launchpadcontent.net/daniel.p/dpscreenocr/ubuntu/ kinetic main # disabled on upgrade to kinetic
/etc/apt/sources.list.d/daniel_p-ubuntu-dpscreenocr-jammy.list:# deb-src https://ppa.launchpadcontent.net/daniel.p/dpscreenocr/ubuntu/ jammy main
/etc/apt/sources.list.d/daniel_p-ubuntu-dpscreenocr-jammy.list.distUpgrade:# deb https://ppa.launchpadcontent.net/daniel.p/dpscreenocr/ubuntu/ kinetic main # disabled on upgrade to kinetic
/etc/apt/sources.list.d/daniel_p-ubuntu-dpscreenocr-jammy.list.distUpgrade:# deb-src https://ppa.launchpadcontent.net/daniel.p/dpscreenocr/ubuntu/ jammy main
~ $ apt list dpscreenocr                                                10:47:53
Listing... Done
dpscreenocr/now 1.3.0-1~jammy1 amd64 [installed,local]

I'm on 23.04 (Lunar).

Thanks for your support with this. Other ideas for how to debug?

danpla commented 11 months ago

So the problem appeared after upgrading Ubuntu from 22.04 (Jammy) to 23.04, and the program worked fine on 22.04, right?

I guess I know what happened. Can you please show the output of ldd `which dpscreenocr` | grep tesseract?

brandones commented 11 months ago

Yeah, that's probably the case.

~ $ ldd $(which dpscreenocr) | grep tesseract                                                       17:45:29
        libtesseract.so.4 => /lib/x86_64-linux-gnu/libtesseract.so.4 (0x00007f589be00000)

danpla commented 11 months ago

Ubuntu 23.04 ships with Tesseract 5, while 22.04 uses Tesseract 4. During the upgrade from 22.04 to 23.04, Tesseract 4 was kept because it is a dependency of dpScreenOCR, but the language packages were upgraded, since they are not dependencies from the package manager's point of view. Tesseract 4 still looks for them in /usr/share/tesseract-ocr/4.00/tessdata/, while they are now in /usr/share/tesseract-ocr/5/tessdata.
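
For anyone who wants to check for this mismatch on their own machine, here is a minimal sketch that combines the ldd check from earlier in the thread with the two tessdata paths mentioned above. The function names tessdata_dir_for and check_tessdata are hypothetical helpers, not part of dpScreenOCR or Tesseract, and the libtesseract.so.5 soname for Tesseract 5 is an assumption (only libtesseract.so.4 appears in this thread).

```shell
#!/bin/sh
# Hypothetical helper: map a libtesseract soname to the tessdata directory
# named in this thread for that major version.
# Assumption: Tesseract 5 is linked as libtesseract.so.5.
tessdata_dir_for() {
    case "$1" in
        libtesseract.so.4|libtesseract.so.4.*)
            echo "/usr/share/tesseract-ocr/4.00/tessdata" ;;
        libtesseract.so.5|libtesseract.so.5.*)
            echo "/usr/share/tesseract-ocr/5/tessdata" ;;
        *)
            return 1 ;;
    esac
}

# Hypothetical helper: report which Tesseract library a program links
# against and whether the matching tessdata directory exists.
# Returns 0 if the directory exists, 1 if it is missing, 2 on other errors.
check_tessdata() {
    bin=$(command -v "$1") || { echo "$1: not found in PATH"; return 2; }
    # The first ldd line mentioning libtesseract, e.g. "libtesseract.so.4 => ..."
    soname=$(ldd "$bin" | awk '/libtesseract/ { print $1; exit }')
    [ -n "$soname" ] || { echo "$1: no libtesseract dependency found"; return 2; }
    dir=$(tessdata_dir_for "$soname") || { echo "$soname: unknown version"; return 2; }
    if [ -d "$dir" ]; then
        echo "$soname -> $dir (present)"
    else
        echo "$soname -> $dir (MISSING)"
        return 1
    fi
}
```

After sourcing these functions, running check_tessdata dpscreenocr should report a missing directory in the situation described above, where the binary still links against libtesseract.so.4 but only the Tesseract 5 data directory is present.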

Long story short, the PPA now has a build for Lunar. I'm not sure if it will show up as a package upgrade since the package version is the same as in 22.04, so you may need to explicitly reinstall it, e.g.:

sudo apt update
sudo apt remove dpscreenocr
sudo apt install dpscreenocr

brandones commented 11 months ago

Had to re-add the PPA, but that did it! Thanks so much for your help @danpla. It's a great tool and I appreciate your work on it.

danpla commented 11 months ago

You're welcome!