tesseract-ocr / tesseract

Tesseract Open Source OCR Engine (main repository)
https://tesseract-ocr.github.io/
Apache License 2.0

Can not read input file in /tmp #4233

Closed. jasonparallel closed this issue 2 months ago.

jasonparallel commented 2 months ago

Current Behavior

On macOS, running

tesseract /tmp/Image-1.png output

fails with

Leptonica Error in findFileFormat: image file not found: /tmp/Image-1.png

while

tesseract /private/tmp/Image-1.png output

works without issue.

/tmp is a symlink to /private/tmp

Expected Behavior

Paths that go through a symlink should be readable.

Suggested Fix

No response

tesseract -v

tesseract 5.3.4
 leptonica-1.84.1
  libgif 5.2.1 : libjpeg 8d (libjpeg-turbo 3.0.0) : libpng 1.6.43 : libtiff 4.6.0 : zlib 1.2.12 : libwebp 1.4.0 : libopenjp2 2.5.2
 Found AVX2
 Found AVX
 Found FMA
 Found SSE4.1
 Found libarchive 3.7.4 zlib/1.2.12 liblzma/5.4.6 bz2lib/1.0.8 liblz4/1.9.4 libzstd/1.5.6
 Found libcurl/8.4.0 SecureTransport (LibreSSL/3.3.6) zlib/1.2.12 nghttp2/1.55.1

Operating System

macOS 14 Sonoma

Other Operating System

No response

uname -a

No response

Compiler

No response

CPU

No response

Virtualization / Containers

No response

Other Information

No response

stweil commented 2 months ago

That's a known issue caused by the special handling of /tmp in Leptonica. It also happens when /tmp is a regular directory, so it does not depend on symbolic links.
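
To check that the operating system itself resolves such a path, independent of Tesseract and Leptonica, something like the following minimal Python sketch may help. The helper name describe_path is made up for this illustration, and the temporary file only stands in for the image from the report.

```python
import os
import tempfile


def describe_path(path):
    """Report how the OS sees a path, without involving Tesseract or Leptonica."""
    return {
        "exists": os.path.exists(path),      # follows symlinks
        "is_symlink": os.path.islink(path),  # True only if the final component is a link
        "resolved": os.path.realpath(path),  # path with all symlinks expanded
    }


if __name__ == "__main__":
    # Stand-in file; substitute the image path from the report when trying this.
    with tempfile.NamedTemporaryFile(suffix=".png") as handle:
        print(describe_path(handle.name))
```

If "exists" is true for the /tmp path while tesseract still reports the file as not found, that would be consistent with the failure coming from the path handling described above rather than from the file system.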

stweil commented 2 months ago

Workaround: use //tmp (or /Tmp on macOS, whose default file system is case-insensitive) instead of /tmp. That circumvents the special handling.
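
For scripts that call tesseract, the //tmp workaround could be applied automatically. The following is only a sketch: the helper names bypass_tmp_handling and tesseract_command are hypothetical, and it assumes that any path beginning with /tmp may trigger the special handling and that a doubled leading slash is treated like a single one, as is the case on Linux and macOS.

```python
def bypass_tmp_handling(path):
    """Prefix paths that begin with /tmp with an extra slash.

    Assumption: every path starting with "/tmp" may be affected, so all of
    them are rewritten; any other path is returned unchanged.
    """
    if path.startswith("/tmp"):
        return "/" + path
    return path


def tesseract_command(image_path, output_base):
    """Build (but do not run) an argument list for the tesseract CLI."""
    return ["tesseract", bypass_tmp_handling(image_path), output_base]


if __name__ == "__main__":
    print(tesseract_command("/tmp/Image-1.png", "output"))
    # prints ['tesseract', '//tmp/Image-1.png', 'output']
```

The resulting list can be passed to subprocess.run. Paths that already start with //tmp or that live elsewhere, such as /private/tmp, are left untouched.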

amitdo commented 2 months ago

> That's a known issue caused by the special handling of /tmp in Leptonica.

We can't do anything about it.