Closed ChristianOsta closed 1 month ago
Unrelated: --psm 7 won't work for a rotated line image. That requires --psm 1 (or no argument for page segmentation mode).
@ChristianOsta, if you want you can try and review the pull request #4243 which fixes the issue.
Notably, this issue does not occur under Windows.
FP exceptions are enabled conditionally in main(). Therefore this exception is not thrown on macOS (with clang compiler) or on Windows (compiler without HAVE_FEENABLEEXCEPT).
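To illustrate the comment above, here is a minimal sketch of how an out-of-range asin argument sets the FE_INVALID flag, and how that flag only becomes a SIGFPE where trapping has been enabled. The helper names AsinRaisesInvalid and EnableInvalidTrap are hypothetical, and the assumption that HAVE_FEENABLEEXCEPT guards a call to glibc's feenableexcept is mine; this is not a copy of the code in main().

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical helper: returns true if evaluating asin(x) sets FE_INVALID.
// This only inspects the status flag, so it never raises a signal by itself.
inline bool AsinRaisesInvalid(float x) {
  std::feclearexcept(FE_ALL_EXCEPT);
  volatile float arg = x;  // volatile keeps the call from being folded at compile time
  volatile float result = std::asin(arg);
  (void)result;
  return std::fetestexcept(FE_INVALID) != 0;
}

// Hypothetical helper: turns FE_INVALID into a SIGFPE trap where supported.
// Assumption: HAVE_FEENABLEEXCEPT is defined by the build only when the
// glibc extension feenableexcept() is available.
inline bool EnableInvalidTrap() {
#if defined(HAVE_FEENABLEEXCEPT)
  return feenableexcept(FE_INVALID) != -1;
#else
  return false;  // flag is still set, but no signal is delivered
#endif
}
```

On a build without the macro, AsinRaisesInvalid(-1.00000012f) still reports the flag, but the program keeps running with a NaN result instead of terminating.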
The fix was pushed to the main branch.
Current Behavior
The image below causes a floating-point exception (SIGFPE) on Ubuntu (WSL) when using the legacy model with psm_mode = 7. The cause is an invalid input to the asinf function: the argument is slightly outside the valid range [-1, 1], specifically -1.00000012. The program then terminates with a SIGFPE error. Notably, this issue does not occur under Windows.
Backtrace: the error originates from the tesseract::Wordrec::angle_change function (see "Other Information").
Tesseract command: tesseract.exe -l eng+deu "tesseract_fail.png" stdout --tessdata-dir "" --oem 0 --psm 7
I used the legacy models for English and German from tesseract-ocr/tessdata.
Interestingly, when the single "d" in the bottom part of the image is moved one pixel up or to the right, the exception is no longer thrown.
I will gladly provide additional information if needed.
Image to reproduce the behavior:
Expected Behavior
Tesseract should handle the input gracefully without causing a floating-point exception.
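One common way to get this kind of graceful handling is to clamp the argument into the domain of asin before the call, so that rounding error such as -1.00000012 cannot produce an invalid input. The sketch below uses a hypothetical helper named SafeAsin; it is not a Tesseract function and is not claimed to be what pull request #4243 does.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of Tesseract: clamp x into [-1, 1] so that
// a value that overshoots by rounding error still yields a finite angle.
inline float SafeAsin(float x) {
  assert(!std::isnan(x));  // clamping would silently hide a NaN input
  const float clamped = std::max(-1.0f, std::min(1.0f, x));
  return std::asin(clamped);
}
```

With this helper, SafeAsin(-1.00000012f) returns the same angle as asin(-1.0f), roughly -pi/2, instead of NaN. Clamping is only appropriate when the overshoot is known to come from rounding; a value far outside [-1, 1] would point to a real bug upstream.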
Suggested Fix
No response
tesseract -v
tesseract 5.3.4
leptonica-1.83.1
libgif 5.2.1 : libjpeg 8d (libjpeg-turbo 3.0.0) : libpng 1.6.43 : libtiff 4.6.0 : zlib 1.2.13 : libwebp 1.4.0 : libopenjp2 2.5.2
Found AVX512BW
Found AVX512F
Found AVX512VNNI
Found AVX2
Found AVX
Found FMA
Found SSE4.1
Found OpenMP 201511
Found libarchive 3.7.2 zlib/1.2.13 liblzma/5.2.6 bz2lib/1.0.8 liblz4/1.9.3 libzstd/1.5.5
Operating System
No response
Other Operating System
Ubuntu inside Windows Subsystem for Linux (WSL)
Distributor ID: Ubuntu
Description: Ubuntu 22.04.4 LTS
Release: 22.04
Codename: jammy
uname -a
Linux 5.15.146.1-microsoft-standard-WSL2 #1 SMP Thu Jan 11 04:09:03 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Compiler
No response
CPU
No response
Virtualization / Containers
No response
Other Information