tesseract-ocr / tesseract

Tesseract Open Source OCR Engine (main repository)
https://tesseract-ocr.github.io/
Apache License 2.0

Error while running on Red Hat: DotProductAVX can't be used on Android #1113

Closed AzkaGilani closed 5 years ago

AzkaGilani commented 7 years ago

I have used Tesseract in my project, which has been deployed on a Red Hat server. It compiles and runs fine on Ubuntu, but when I compile the same program on Red Hat it fails at run time with the error "DotProductAVX can't be used on Android." I have found similar issues on the internet, but the problem is that I can't update the Tesseract version or even rebuild it. Is there any other way to solve this? Should I change something in my C++ code?

`tesseract --version` gives the following output:

```
tesseract 0afd593
 leptonica-1.74
  libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.2.54 : libtiff 4.0.6 : zlib 1.2.8

 Found AVX
 Found SSE
```

stweil commented 7 years ago

So the build process did not add the code needed for AVX => arch/dotproductavx.cpp was compiled without __AVX__ being defined => The compiler was called without -mavx for that file. As you did not tell us more about your build, you are currently the only one who can solve that problem. A full build log (commands used, output from configure / make) would help.
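To illustrate the failure mode described above, here is a minimal sketch. It is not Tesseract's source: `CompiledWithAvx` and `DotProductAvxSketch` are hypothetical names, and the scalar loop only stands in for real AVX intrinsics. The one fact it relies on is that GCC and Clang define `__AVX__` when a file is compiled with `-mavx`; without that flag only the fallback branch ends up in the object file.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: true when this translation unit was compiled with
// AVX code generation enabled (GCC/Clang define __AVX__ under -mavx).
constexpr bool CompiledWithAvx() {
#if defined(__AVX__)
  return true;
#else
  return false;
#endif
}

// Hypothetical stand-in for an AVX dot product. Returns false and leaves
// *result untouched when no AVX code was compiled in. According to this
// thread, the real stub reports an error and aborts instead.
bool DotProductAvxSketch(const double* u, const double* v, std::size_t n,
                         double* result) {
  assert(result != nullptr);
#if defined(__AVX__)
  double sum = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    sum += u[i] * v[i];  // real code would use AVX intrinsics here
  }
  *result = sum;
  return true;
#else
  (void)u;
  (void)v;
  (void)n;
  return false;  // the AVX body was never compiled for this file
#endif
}
```

Compiling this file once with `-mavx` and once without it, then calling `CompiledWithAvx()`, shows which of the two branches a given build contains.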

stweil commented 7 years ago

The Tesseract version in the output from tesseract --version looks strange. Which code did you compile?

AzkaGilani commented 7 years ago

I compiled Tesseract from source. The build is tesseract-4.0Alpha, and 0afd593 is the commit hash of that build.

AzkaGilani commented 7 years ago

I used the following commands to build the project. It works fine on Ubuntu 16.04 but fails on Red Hat when a Tesseract function is called.

```
sudo apt-get install autoconf automake libtool
sudo apt-get install autoconf-archive
sudo apt-get install pkg-config
sudo apt-get install libpng12-dev
sudo apt-get install libjpeg8-dev
sudo apt-get install libtiff5-dev
sudo apt-get install zlib1g-dev

wget http://www.leptonica.org/source/leptonica-1.74.tar.gz

tar -zxvf leptonica-1.74.tar.gz
cd leptonica-1.74
./configure
make
sudo make install

git clone git@gitlab.com:VisionX/core/libs/extern/tesseract.git
cd tesseract
git fetch origin
git branch -v -a
git checkout -b tesseract-4.0Alpha origin/tesseract-4.0Alpha
git pull origin tesseract-4.0Alpha
./autogen.sh
./configure --enable-debug
LDFLAGS="-L/usr/local/lib" CFLAGS="-I/usr/local/include" make
sudo make install
sudo ldconfig
```

stweil commented 7 years ago

Could you please also add the output from the Tesseract configure / make?

AzkaGilani commented 7 years ago

~/tesseract$ ./configure --enable-debug
checking for g++... g++
checking whether the C++ compiler works... yes
checking for C++ compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
Using git revision: 0afd593
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... no
checking for mawk... mawk
checking whether make sets $(MAKE)... yes
checking for style of include used by make... GNU
checking whether make supports nested variables... yes
checking dependency style of g++... gcc3
checking whether to enable maintainer-specific portions of Makefiles... no
checking build system type... x86_64-pc-linux-gnu
checking host system type... x86_64-pc-linux-gnu
checking whether C++ compiler accepts -mavx... yes
checking whether C++ compiler accepts -msse4.1... yes
checking --enable-graphics argument... yes
checking --enable-embedded argument... no
checking for g++ option to support OpenMP... -fopenmp
checking --enable-opencl argument... no
checking how to run the C++ preprocessor... g++ -E
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking CL/cl.h usability... no
checking CL/cl.h presence... no
checking for CL/cl.h... no
checking OpenCL/cl.h usability... no
checking OpenCL/cl.h presence... no
checking for OpenCL/cl.h... no
checking tiffio.h usability... yes
checking tiffio.h presence... yes
checking for tiffio.h... yes
checking for clGetPlatformIDs in -lOpenCL... no
checking --enable-visibility argument... no
checking --enable-multiple-libraries argument... no
checking whether to use tessdata-prefix... yes
checking whether to enable debugging... yes
checking how to print strings... printf
checking for gcc... gcc
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking whether gcc understands -c and -o together... yes
checking dependency style of gcc... gcc3
checking for a sed that does not truncate output... /bin/sed
checking for fgrep... /bin/grep -F
checking for ld used by gcc... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... yes
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
checking the name lister (/usr/bin/nm -B) interface... BSD nm
checking whether ln -s works... yes
checking the maximum length of command line arguments... 1572864
checking how to convert x86_64-pc-linux-gnu file names to x86_64-pc-linux-gnu format... func_convert_file_noop
checking how to convert x86_64-pc-linux-gnu file names to toolchain format... func_convert_file_noop
checking for /usr/bin/ld option to reload object files... -r
checking for objdump... objdump
checking how to recognize dependent libraries... pass_all
checking for dlltool... no
checking how to associate runtime and link libraries... printf %s\n
checking for ar... ar
checking for archiver @FILE support... @
checking for strip... strip
checking for ranlib... ranlib
checking command to parse /usr/bin/nm -B output from gcc object... ok
checking for sysroot... no
checking for a working dd... /bin/dd
checking how to truncate binary pipes... /bin/dd bs=4096 count=1
checking for mt... mt
checking if mt is a manifest tool... no
checking for dlfcn.h... yes
checking for objdir... .libs
checking if gcc supports -fno-rtti -fno-exceptions... no
checking for gcc option to produce PIC... -fPIC -DPIC
checking if gcc PIC flag -fPIC -DPIC works... yes
checking if gcc static flag -static works... yes
checking if gcc supports -c -o file.o... yes
checking if gcc supports -c -o file.o... (cached) yes
checking whether the gcc linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking whether -lc should be explicitly linked in... no
checking dynamic linker characteristics... GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is possible... yes
checking if libtool supports shared libraries... yes
checking whether to build shared libraries... yes
checking whether to build static libraries... yes
checking how to run the C++ preprocessor... g++ -E
checking for ld used by g++... /usr/bin/ld -m elf_x86_64
checking if the linker (/usr/bin/ld -m elf_x86_64) is GNU ld... yes
checking whether the g++ linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking for g++ option to produce PIC... -fPIC -DPIC
checking if g++ PIC flag -fPIC -DPIC works... yes
checking if g++ static flag -static works... yes
checking if g++ supports -c -o file.o... yes
checking if g++ supports -c -o file.o... (cached) yes
checking whether the g++ linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking dynamic linker characteristics... (cached) GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking whether byte ordering is bigendian... no
checking if compiling with clang... no
checking whether compiler supports C++11... yes
checking for snprintf... yes
checking for library containing sem_init... -lpthread
checking for ANSI C header files... (cached) yes
checking whether time.h and sys/time.h may both be included... yes
checking for sys/wait.h that is POSIX.1 compatible... yes
checking sys/ipc.h usability... yes
checking sys/ipc.h presence... yes
checking for sys/ipc.h... yes
checking sys/shm.h usability... yes
checking sys/shm.h presence... yes
checking for sys/shm.h... yes
checking limits.h usability... yes
checking limits.h presence... yes
checking for limits.h... yes
checking malloc.h usability... yes
checking malloc.h presence... yes
checking for malloc.h... yes
checking for stdbool.h that conforms to C99... no
checking for _Bool... no
checking whether #! works in shell scripts... yes
checking for special C compiler options needed for large files... no
checking for _FILE_OFFSET_BITS value needed for large files... no
checking for getline... yes
checking for wchar_t... yes
checking for long long int... yes
checking for off_t... yes
checking for mbstate_t... yes
checking for pkg-config... /usr/bin/pkg-config
checking pkg-config is at least version 0.9.0... yes
checking for LEPTONICA... yes
checking for ICU_UC... yes
checking for ICU_I18N... yes
checking for pango... yes
checking for cairo... yes
checking that generated files are newer than configure... done
configure: creating ./config.status
config.status: creating Makefile
config.status: creating tesseract.pc
config.status: creating api/Makefile
config.status: creating arch/Makefile
config.status: creating ccmain/Makefile
config.status: creating opencl/Makefile
config.status: creating ccstruct/Makefile
config.status: creating ccutil/Makefile
config.status: creating classify/Makefile
config.status: creating cutil/Makefile
config.status: creating dict/Makefile
config.status: creating lstm/Makefile
config.status: creating textord/Makefile
config.status: creating viewer/Makefile
config.status: creating wordrec/Makefile
config.status: creating tessdata/Makefile
config.status: creating tessdata/configs/Makefile
config.status: creating tessdata/tessconfigs/Makefile
config.status: creating testing/Makefile
config.status: creating java/Makefile
config.status: creating java/com/Makefile
config.status: creating java/com/google/Makefile
config.status: creating java/com/google/scrollview/Makefile
config.status: creating java/com/google/scrollview/events/Makefile
config.status: creating java/com/google/scrollview/ui/Makefile
config.status: creating doc/Makefile
config.status: creating training/Makefile
config.status: creating config_auto.h
config.status: config_auto.h is unchanged
config.status: executing depfiles commands
config.status: executing libtool commands

Configuration is done. You can now build and install tesseract by running:

$ make
$ sudo make install

Training tools can be build and installed (after building of tesseract) with:

$ make training
$ sudo make training-install

stweil commented 7 years ago

checking whether C++ compiler accepts -mavx... yes

So your compiler supports -mavx. Is it used for arch/dotproductavx.cpp (see output from make)?

AzkaGilani commented 7 years ago

Tesseract was built successfully. When I run make again, I get the following output.

~/tesseract$ LDFLAGS="-L/usr/local/lib" CFLAGS="-I/usr/local/include" make
make all-recursive
make[1]: Entering directory '/home/azka/tesseract'
Making all in arch
make[2]: Entering directory '/home/azka/tesseract/arch'
make[3]: Entering directory '/home/azka/tesseract/arch'
make[3]: Nothing to be done for 'all-am'.
make[3]: Leaving directory '/home/azka/tesseract/arch'
make[2]: Leaving directory '/home/azka/tesseract/arch'
Making all in ccutil
make[2]: Entering directory '/home/azka/tesseract/ccutil'
make[3]: Entering directory '/home/azka/tesseract/ccutil'
make[3]: Nothing to be done for 'all-am'.
make[3]: Leaving directory '/home/azka/tesseract/ccutil'
make[2]: Leaving directory '/home/azka/tesseract/ccutil'
Making all in viewer
make[2]: Entering directory '/home/azka/tesseract/viewer'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/azka/tesseract/viewer'
Making all in cutil
make[2]: Entering directory '/home/azka/tesseract/cutil'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/azka/tesseract/cutil'
Making all in opencl
make[2]: Entering directory '/home/azka/tesseract/opencl'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/azka/tesseract/opencl'
Making all in ccstruct
make[2]: Entering directory '/home/azka/tesseract/ccstruct'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/azka/tesseract/ccstruct'
Making all in dict
make[2]: Entering directory '/home/azka/tesseract/dict'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/azka/tesseract/dict'
Making all in classify
make[2]: Entering directory '/home/azka/tesseract/classify'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/azka/tesseract/classify'
Making all in wordrec
make[2]: Entering directory '/home/azka/tesseract/wordrec'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/azka/tesseract/wordrec'
Making all in textord
make[2]: Entering directory '/home/azka/tesseract/textord'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/azka/tesseract/textord'
Making all in lstm
make[2]: Entering directory '/home/azka/tesseract/lstm'
make[3]: Entering directory '/home/azka/tesseract/lstm'
make[3]: Nothing to be done for 'all-am'.
make[3]: Leaving directory '/home/azka/tesseract/lstm'
make[2]: Leaving directory '/home/azka/tesseract/lstm'
Making all in ccmain
make[2]: Entering directory '/home/azka/tesseract/ccmain'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/azka/tesseract/ccmain'
Making all in api
make[2]: Entering directory '/home/azka/tesseract/api'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/azka/tesseract/api'
Making all in .
make[2]: Entering directory '/home/azka/tesseract'
make[2]: Leaving directory '/home/azka/tesseract'
Making all in tessdata
make[2]: Entering directory '/home/azka/tesseract/tessdata'
Making all in configs
make[3]: Entering directory '/home/azka/tesseract/tessdata/configs'
make[3]: Nothing to be done for 'all'.
make[3]: Leaving directory '/home/azka/tesseract/tessdata/configs'
Making all in tessconfigs
make[3]: Entering directory '/home/azka/tesseract/tessdata/tessconfigs'
make[3]: Nothing to be done for 'all'.
make[3]: Leaving directory '/home/azka/tesseract/tessdata/tessconfigs'
make[3]: Entering directory '/home/azka/tesseract/tessdata'
make[3]: Nothing to be done for 'all-am'.
make[3]: Leaving directory '/home/azka/tesseract/tessdata'
make[2]: Leaving directory '/home/azka/tesseract/tessdata'
Making all in doc
make[2]: Entering directory '/home/azka/tesseract/doc'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/home/azka/tesseract/doc'
make[1]: Leaving directory '/home/azka/tesseract'

stweil commented 7 years ago

Run `touch arch/dotproductavx.cpp` before running `make`. Then that file will be compiled again.

AzkaGilani commented 7 years ago

I debugged my code and found that the program throws this error on this particular line: `api->GetUTF8Text()`

Any idea about that?

stweil commented 7 years ago

That's expected. Could you please try to get more information from the build (see my previous comment)?

AzkaGilani commented 7 years ago

Yes, it is taking time. I will share the result shortly.

AzkaGilani commented 7 years ago

TesseractBuildInformation.TXT

This is the build information. Is any other information required from my side? :)

AzkaGilani commented 7 years ago

Output after running `sudo make install`: Tesseract-InstallOutput.TXT

stweil commented 7 years ago

Thank you. TesseractBuildInformation.TXT is sufficient. It shows that the compiler option -mavx was not used. That explains the problem, but it is still not clear why that option is missing.

I suggest rebuilding with the latest sources from Git. Don't forget to run ./autogen.sh before starting a new build.
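The mismatch diagnosed here (the CPU reports AVX, but the AVX source file was compiled without `-mavx`) can be checked from a small test program. This is only a sketch under assumptions: the function names are hypothetical, and the runtime check uses the GCC/Clang builtin `__builtin_cpu_supports`, which is not what Tesseract itself uses (per the code later in this thread, it calls `__get_cpuid`).

```cpp
#include <cassert>

// Hypothetical helper: true when this file was compiled with -mavx
// (or an equivalent option that makes the compiler define __AVX__).
constexpr bool BuildHasAvx() {
#if defined(__AVX__)
  return true;
#else
  return false;
#endif
}

// Hypothetical helper: true when the CPU running the program reports AVX.
// Uses a GCC/Clang builtin on x86; other targets simply report false.
bool CpuHasAvx() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  return __builtin_cpu_supports("avx") != 0;
#else
  return false;
#endif
}

// The combination behind the error in this thread: AVX is detected at run
// time although no AVX code was compiled in.
bool AvxDetectedButNotCompiled() {
  const bool mismatch = CpuHasAvx() && !BuildHasAvx();
  assert(!(mismatch && BuildHasAvx()));  // a build with AVX never mismatches
  return mismatch;
}
```

If `AvxDetectedButNotCompiled()` returns true for a file built with the same flags as `arch/dotproductavx.cpp`, the build reproduces the situation described above.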

AzkaGilani commented 7 years ago

It is built on my client's server, and I don't have the rights to build the Tesseract library again. Is there any other workaround?

AzkaGilani commented 7 years ago

After running `touch arch/dotproductavx.cpp` and then running make, my program now gives this error:

/lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /usr/local/lib/libtesseract.so.4)
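An error of this kind usually means that `libtesseract.so.4` was linked against a newer libstdc++ than the one installed in `/lib64` on the machine where the program runs. To my knowledge the `CXXABI_1.3.8` symbol version first shipped with GCC 4.9, so a library built with a newer GCC will not load against an older runtime. The sketch below is an assumption-laden aid, not part of Tesseract: `StdlibBuildInfo` is a hypothetical helper that reports which C++ standard library headers a program was compiled against, which helps when comparing the build machine with the deployment machine. It does not inspect the runtime library itself.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: describes the C++ standard library headers used at
// compile time. __GLIBCXX__ (libstdc++) and _LIBCPP_VERSION (libc++) are
// defined by the respective library headers, for example via <string>.
std::string StdlibBuildInfo() {
#if defined(__GLIBCXX__)
  std::string info = "libstdc++ headers dated " + std::to_string(__GLIBCXX__);
#elif defined(_LIBCPP_VERSION)
  std::string info = "libc++ version " + std::to_string(_LIBCPP_VERSION);
#else
  std::string info = "unknown C++ standard library";
#endif
  assert(!info.empty());
  return info;
}
```

Printing the returned string on both machines gives a first hint whether the toolchains differ.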

zdenop commented 6 years ago

@AzkaGilani : did you solve this?

willus commented 5 years ago

There is a simple fix to this whole thing. Rather than having the Tesseract code abort if AVX is detected but not compiled in, put the directives in the compiled code so that it is never detected if it is not compiled in. Not sure why the source code wasn't written this way. Here is how the simddetect.cpp module should look:

```cpp
///////////////////////////////////////////////////////////////////////
// File:        simddetect.cpp
// Description: Architecture detector.
// Author:      Stefan Weil (based on code from Ray Smith)
//
// (C) Copyright 2014, Google Inc.
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
// http://www.apache.org/licenses/LICENSE-2.0
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
///////////////////////////////////////////////////////////////////////

#include "simddetect.h"

#undef X86_BUILD
#if defined(__x86_64__) || defined(__i386__) || defined(_WIN32)
#if !defined(ANDROID_BUILD)
#define X86_BUILD 1
#endif  // !ANDROID_BUILD
#endif  // x86 target

#if defined(X86_BUILD)
#if defined(__GNUC__)
#include <cpuid.h>
#elif defined(_WIN32)
#include <intrin.h>
#endif
#endif

SIMDDetect SIMDDetect::detector;

// If true, then AVX has been detected.
bool SIMDDetect::avx_available_;
bool SIMDDetect::avx2_available_;
bool SIMDDetect::avx512F_available_;
bool SIMDDetect::avx512BW_available_;
// If true, then SSE4.1 has been detected.
bool SIMDDetect::sse_available_;

// Constructor.
// Tests the architecture in a system-dependent way to detect AVX, SSE and
// any other available SIMD equipment.
// __GNUC__ is also defined by compilers that include GNU extensions such as
// clang.
SIMDDetect::SIMDDetect() {
#if defined(X86_BUILD)
#if defined(__GNUC__)
  unsigned int eax, ebx, ecx, edx;
  if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) != 0) {
    // Note that these tests all use hex because the older compilers don't have
    // the newer flags.
#ifdef __SSE4_1__
    sse_available_ = (ecx & 0x00080000) != 0;
#else
    sse_available_ = false;
#endif
#ifdef __AVX__
    avx_available_ = (ecx & 0x10000000) != 0;
#else
    avx_available_ = false;
#endif
    if (avx_available_) {
      // There is supposed to be a __get_cpuid_count function, but this is all
      // there is in my cpuid.h. It is a macro for an asm statement and cannot
      // be used inside an if.
      __cpuid_count(7, 0, eax, ebx, ecx, edx);
#ifdef __AVX2__
      avx2_available_ = (ebx & 0x00000020) != 0;
#else
      avx2_available_ = false;
#endif
      avx512F_available_ = (ebx & 0x00010000) != 0;
      avx512BW_available_ = (ebx & 0x40000000) != 0;
    }
  }
#elif defined(_WIN32)
  int cpuInfo[4];
  __cpuid(cpuInfo, 0);
  if (cpuInfo[0] >= 1) {
    __cpuid(cpuInfo, 1);
#ifdef __SSE4_1__
    sse_available_ = (cpuInfo[2] & 0x00080000) != 0;
#else
    sse_available_ = false;
#endif
#ifdef __AVX__
    avx_available_ = (cpuInfo[2] & 0x10000000) != 0;
#else
    avx_available_ = false;
#endif
#ifndef __AVX2__
    avx2_available_ = false;
#endif
  }
#else
#error "I don't know how to test for SIMD with this compiler"
#endif
#endif  // X86_BUILD
}
```
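The idea in the proposal above can be reduced to a small, testable function: a feature counts as available only if the CPUID bit is set and the matching code path was compiled in. The following is a sketch of that gating only, not the code that was eventually merged. `UsableSimd`, `GateOnBuild` and the two `kBuiltWith...` constants are hypothetical names; the bit masks are the ones used in the listing above for CPUID leaf 1, register ECX.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical compile-time flags mirroring the compiler's own macros.
#if defined(__SSE4_1__)
constexpr bool kBuiltWithSse41 = true;
#else
constexpr bool kBuiltWithSse41 = false;
#endif
#if defined(__AVX__)
constexpr bool kBuiltWithAvx = true;
#else
constexpr bool kBuiltWithAvx = false;
#endif

// Hypothetical result type: features that are both present on the CPU and
// compiled into this build.
struct UsableSimd {
  bool sse41;
  bool avx;
};

// Combines the ECX value of CPUID leaf 1 with the compile-time flags, so a
// feature that was not compiled in is never reported as usable.
inline UsableSimd GateOnBuild(std::uint32_t ecx) {
  UsableSimd usable;
  usable.sse41 = kBuiltWithSse41 && (ecx & 0x00080000u) != 0;
  usable.avx = kBuiltWithAvx && (ecx & 0x10000000u) != 0;
  assert(!(usable.avx && !kBuiltWithAvx));  // gating invariant
  return usable;
}
```

With this shape, a caller that dispatches on `usable.avx` can never reach an AVX function that exists only as an aborting stub.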

stweil commented 5 years ago

@willus, that's correct and can be simplified even further, but it needs a little more code to set `__AVX__`, `__AVX2__` and `__SSE4_1__`. Would https://github.com/stweil/tesseract/tree/simdetect be fine?

stweil commented 5 years ago

I have now created a pull request (#2135) to address this issue.

willus commented 5 years ago

@stweil, thank you. I agree that it could be simplified, and I re-did my version a few minutes later almost exactly the same way. I don't know C++ well enough: do declared booleans automatically default to false? I don't see where the `avx_available_` boolean, for example, is initially set to false.

stweil commented 5 years ago

@AzkaGilani, I assume that the issue is solved in Git master, so I am closing this issue. Please report if there is still a problem.

stweil commented 5 years ago

do declared booleans automatically default to false?

Yes, global variables are set automatically to 0 or false if they did not get an explicit initial value.
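A short sketch of that rule, with hypothetical names (`FeatureFlags` is not a Tesseract class): objects with static storage duration, which includes static data members like the ones in `SIMDDetect`, are zero-initialized before any other initialization, so a `bool` without an initializer starts out as `false`. This does not apply to local (automatic) variables, whose value is indeterminate until assigned.

```cpp
#include <cassert>

// Hypothetical class with a static flag, similar in shape to the members
// discussed in this thread.
class FeatureFlags {
 public:
  static bool avx_available() { return avx_available_; }
  static void set_avx_available(bool value) { avx_available_ = value; }

 private:
  static bool avx_available_;
};

// Definition without an initializer: static storage duration means the
// member is zero-initialized, i.e. false, before the program starts.
bool FeatureFlags::avx_available_;

// Hypothetical self-check that relies on the default value.
inline bool DefaultIsFalse() {
  const bool initial = FeatureFlags::avx_available();
  assert(initial == false || initial == true);
  return !initial;
}
```

Calling `FeatureFlags::avx_available()` before any setter has run returns `false`.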