tesseract-ocr / tesseract

Tesseract Open Source OCR Engine (main repository)
https://tesseract-ocr.github.io/
Apache License 2.0

Build system drags in "avx512" support on a setup that does not support these instructions #4254

Open TurtleWilly opened 4 months ago

TurtleWilly commented 4 months ago

Current Behavior

I'm building the current release of tesseract (5.3.4) on macOS with gcc 9.2 on a machine that does not have the avx512 instruction set (Haswell CPU).

$ ./configure \
    CC='/usr/local/gcc9/bin/gcc-9.2'           \
    CXX='/usr/local/gcc9/bin/g++-9.2'          \
    CFLAGS='  -O3 -march=native -mtune=native' \
    CXXFLAGS='-O3 -march=native -mtune=native'
checking whether the C++ compiler works... yes
…
checking whether C++ compiler accepts -Werror=unused-command-line-argument... no
checking whether C++ compiler accepts -mavx... yes
checking whether C++ compiler accepts -mavx2... yes
checking whether C++ compiler accepts -mavx512f... yes
checking whether C++ compiler accepts -mfma... yes
checking whether C++ compiler accepts -msse4.1... yes
checking for feenableexcept... no
…

$ make
Making all in .
…
libtool: compile: /usr/local/gcc9/bin/gcc-9.2 -DHAVE_CONFIG_H -DHAVE_FRAMEWORK_ACCELERATE -O2 \
    -DNDEBUG -I./include -I./include -I/usr/local/silo/leptonica/latest/include/leptonica     \
    -I/usr/local/silo/libarchive/latest/include -mavx512f -I./src/ccutil -O3 -march=native    \
    -mtune=native -std=c++17 -MT src/arch/libtesseract_avx512_la-dotproductavx512.lo -MD -MP  \
    -MF src/arch/.deps/libtesseract_avx512_la-dotproductavx512.Tpo -c                         \
    src/arch/dotproductavx512.cpp -fno-common -DPIC -o                                        \
    src/arch/.libs/libtesseract_avx512_la-dotproductavx512.o
/var/folders/b7/ss_s7c792_54wsn_vjk3jrym0000gn/T//cctLcYFO.s:40:2: error: instruction requires: AVX-512 ISA
        vmovups (%rdi,%r10), %zmm1
        ^
/var/folders/b7/ss_s7c792_54wsn_vjk3jrym0000gn/T//cctLcYFO.s:41:2: error: instruction requires: AVX-512 ISA
        vfmadd231ps     (%rsi,%r10), %zmm1, %zmm0
        ^
/var/folders/b7/ss_s7c792_54wsn_vjk3jrym0000gn/T//cctLcYFO.s:44:2: error: instruction requires: AVX-512 ISA
        vmovups (%rdi,%r10), %zmm5
        ^
# (many more errors like this)
…

Note the line "checking whether C++ compiler accepts -mavx512f... yes": it seems to check only whether the compiler accepts the argument (which it does), not whether the toolchain can actually produce working code. I had to manually patch this check in the configure script to make things build.
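The difference between "accepts the flag" and "can build the code" can be probed by compiling and assembling a tiny AVX-512 source file, since the errors above come from the temporary .s file, i.e. from the assembler stage. The following is only a minimal sketch: the helper name can_build_avx512 and the probe source are hypothetical and are not part of tesseract's build system.

```shell
# Hypothetical helper (not part of tesseract): returns 0 only if the given
# compiler can compile AND assemble a small AVX-512 source file with -mavx512f.
can_build_avx512() {
    probe_cxx=$1
    probe_dir=$(mktemp -d) || return 2

    # The probe uses 512-bit intrinsics so that zmm instructions reach the assembler.
    cat > "$probe_dir/probe.cpp" <<'EOF'
#include <immintrin.h>
int main(int argc, char **) {
    float out[16];
    _mm512_storeu_ps(out, _mm512_set1_ps((float)argc));
    return (int)out[0];
}
EOF

    # -c stops after assembling, which is the stage that failed in the log above.
    "$probe_cxx" -mavx512f -c "$probe_dir/probe.cpp" -o "$probe_dir/probe.o" >/dev/null 2>&1
    probe_rc=$?

    rm -rf "$probe_dir"
    return "$probe_rc"
}

# Usage: fall back to "c++" if CXX is not set (an assumption for this sketch).
if can_build_avx512 "${CXX:-c++}"; then
    echo "AVX-512 objects can be built"
else
    echo "-mavx512f may be accepted, but AVX-512 objects cannot be built"
fi
```

This only checks that an object file can be produced; it says nothing about whether the CPU can run AVX-512 code.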

Expected Behavior

The configure script detects that "avx512" can't be used, and the build then completes without trouble.

Suggested Fix

I don't know. I'm not experienced enough with tesseract's build system. Maybe a --disable-avx512 switch in the configure script could work, if automatic detection isn't suitable.

tesseract -v

n/a (build failed)

Operating System

No response

Other Operating System

macOS

uname -a

No response

Compiler

gcc 9.2

CPU

No response

Virtualization / Containers

No response

Other Information

No response

stweil commented 4 months ago

Setting CC and CFLAGS has no effect (neither is used).

Does the build work if you don't set CXXFLAGS? Or with another compiler (a newer gcc or Apple's clang)? Normally compilers have no problem building AVX512 code if they accept -mavx512f.

TurtleWilly commented 4 months ago

No CXXFLAGS, just CXX, fails too:

libtool: compile:  /usr/local/gcc9/bin/g++-9.2 -DHAVE_CONFIG_H -DHAVE_FRAMEWORK_ACCELERATE -O2 -DNDEBUG -I./include -I./include -I/usr/local/silo/leptonica/latest/include/leptonica -I/usr/local/silo/libarchive/latest/include -mavx512f -I./src/ccutil -g -O2 -std=c++17 -MT src/arch/libtesseract_avx512_la-dotproductavx512.lo -MD -MP -MF src/arch/.deps/libtesseract_avx512_la-dotproductavx512.Tpo -c src/arch/dotproductavx512.cpp  -fno-common -DPIC -o src/arch/.libs/libtesseract_avx512_la-dotproductavx512.o
/var/folders/b7/ss_s7c792_54wsn_vjk3jrym0000gn/T//ccMgthl0.s:70:2: error: instruction requires: AVX-512 ISA
        vmovups (%rdi,%rax), %zmm2
        ^
/var/folders/b7/ss_s7c792_54wsn_vjk3jrym0000gn/T//ccMgthl0.s:71:2: error: instruction requires: AVX-512 ISA
        vfmadd231ps     (%rsi,%rax), %zmm2, %zmm0
        ^
/var/folders/b7/ss_s7c792_54wsn_vjk3jrym0000gn/T//ccMgthl0.s:105:2: error: instruction requires: AVX-512 ISA
        vextractf64x4   $0x1, %zmm0, %ymm1
        ^
make[1]: *** [src/arch/libtesseract_avx512_la-dotproductavx512.lo] Error 1
make: *** [all-recursive] Error 1

My other compiler is clang (Apple/Xcode), but it's too old for tesseract's modern (erm… or "no-backwards-compatibility" 😎 ) build requirements, and configure aborts because of the missing C++17 support. That's why I had to switch to gcc 9.2 in the first place (clang is usually my default compiler).