Open cchadowitz opened 3 months ago
In addition, this is the version of liblept5 that is installed in a clean Ubuntu 24.04 container when installing tesseract-ocr (there is no newer version of liblept5 available from the above tesseract daily-dev ppa):
# apt info liblept5
Package: liblept5
Version: 1.82.0-3build4
Priority: optional
Section: universe/libs
Source: leptonlib
Origin: Ubuntu
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Original-Maintainer: Jeff Breidenbach <jab@debian.org>
Bugs: https://bugs.launchpad.net/ubuntu/+filebug
Installed-Size: 2726 kB
Depends: libc6 (>= 2.33), libgif7 (>= 5.1), libjpeg8 (>= 8c), libopenjp2-7 (>= 2.0.0), libpng16-16t64 (>= 1.6.2), libtiff6 (>= 4.0.3), libwebp7 (>= 1.3.2), libwebpmux3 (>= 1.3.2), zlib1g (>= 1:1.1.4)
Breaks: libleptonica (>= 1.69~)
Replaces: libleptonica (>= 1.69~)
Homepage: http://www.leptonica.org
Task: kubuntu-desktop, kubuntu-full, ubuntustudio-video, ubuntustudio-graphics, ubuntustudio-publishing
Download-Size: 1099 kB
APT-Manual-Installed: no
APT-Sources: http://archive.ubuntu.com/ubuntu noble/universe amd64 Packages
Description: image processing library
Current Behavior
When running this command line:
The following occurs:
This is reproducible via the following sequence of commands (output is clipped for brevity until the end) to start a clean Ubuntu 24.04 docker container, update existing packages, and install tesseract-ocr (for command line usage) and the two languages in question, tesseract-ocr-ara and tesseract-ocr-chi-tra. The test image is the same image as in #4148; wget is used to download it for the test. It is also available in this ticket below.
Backtrace:
This also occurred when using the latest package from the daily-dev ppa here, version included below.
The test image is included here again for reference.
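The reproduction sequence described above could be sketched roughly as follows. This is an illustrative outline, not the exact commands from the report: `IMAGE_URL` and the `container_steps` helper are hypothetical, the image path is arbitrary, and the `tesseract` invocation (language order, output target) is an assumption based only on the two language packages named above.

```shell
#!/usr/bin/env bash
# Sketch only: the report's exact commands are not shown in the ticket text.
# IMAGE_URL is a hypothetical placeholder; substitute the test image attached to the ticket.
IMAGE_URL="${IMAGE_URL:-https://example.invalid/test-image}"

# Print the commands to run inside a clean Ubuntu 24.04 container.
# $1: URL of the test image (assumption: any URL reachable by wget).
container_steps() {
  cat <<EOF
export DEBIAN_FRONTEND=noninteractive
apt-get update
apt-get -y upgrade
apt-get -y install tesseract-ocr tesseract-ocr-ara tesseract-ocr-chi-tra wget
wget -q -O /tmp/test-image '$1'
tesseract /tmp/test-image stdout -l ara+chi_tra
EOF
}

# Example (requires Docker; left commented out so the sketch has no side effects):
# docker run --rm ubuntu:24.04 bash -exc "$(container_steps "$IMAGE_URL")"
```

Passing the generated steps as a single `bash -exc` argument keeps stdin free, so package installation cannot accidentally consume the script.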
Expected Behavior
As in #4148 and #4146, the expectation is that this combination of languages and image would not cause a sigabrt.
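For scripted checks of this expectation, a small wrapper can distinguish an abort from an ordinary failure. The sketch below assumes a Linux shell, where a process killed by SIGABRT (signal 6) is reported with exit status 134 (128 + 6); `run_and_classify` is a hypothetical helper name, not part of tesseract.

```shell
#!/usr/bin/env bash
# Run a command quietly and classify how it ended.
# Assumption: Linux signal numbering, where SIGABRT is 6, so the shell reports 128 + 6 = 134.
run_and_classify() {
  "$@" >/dev/null 2>&1
  status=$?
  if [ "$status" -eq 134 ]; then
    echo "sigabrt"
  elif [ "$status" -eq 0 ]; then
    echo "ok"
  else
    echo "exit:$status"
  fi
}
```

It can wrap any command, for example the tesseract invocation that triggers the crash; a result of `sigabrt` would indicate the behavior reported here rather than a normal "no text found" outcome.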
Suggested Fix
No known suggested fixes at this time.
tesseract -v
Current Ubuntu 24.04 tesseract-ocr package:
From the current latest package available from this daily-dev ppa:
Compiler
CPU
Virtualization / Containers
Other Information
I opened this new ticket even though this is closely related to #4146 and #4148 as this is entirely reproducible with the latest ubuntu packages for both the command line tesseract and the languages used. This implies that while the image may not have discernible text for the OCR process to function, it is still causing a sigabrt with a "standard configuration".