tesseract-ocr / tesseract

Tesseract Open Source OCR Engine (main repository)
https://tesseract-ocr.github.io/
Apache License 2.0

Described development build fails #3772

Closed mbrunecky closed 2 years ago

mbrunecky commented 2 years ago

Current Behavior:

Following the basic instructions for creating a tesseract DevStudio solution:

git clone https://github.com/tesseract-ocr/tesseract tesseract
cd tesseract
mkdir build && cd build
cmake .. 2>>&1 > cmake_build.log

The cmake run crashes, generating lots of errors.

Expected Behavior:

I expect to get a tesseract .sln file somewhere.

Suggested Fix:

I realize this is not a tesseract bug, but rather an omission in the documentation or a 'sw' problem. But I need to develop/debug tesseract, not sw (nor cmake), so maybe someone can shed some light on it.

This is what my output looks like:

C:\Work\tesseract\build> cmake .. 2>>&1 > cmake_build.log
Configuring tesseract version 5.1.0-13-g3c22...
CMAKE_SYSTEM_PROCESSOR=
[1/176] [pub.egorpugin.primitives.command-master]/src/argument.cpp
[2/176] [pub.egorpugin.libuv-1.42.0]/src/win/tty.c
[3/176] [pub.egorpugin.libuv-1.42.0]/src/win/poll.c
[4/176] [pub.egorpugin.primitives.command-master]/src/uv_command.cpp
[10/176] [pub.egorpugin.libuv-1.42.0]/src/timer.c
[11/176] [pub.egorpugin.libuv-1.42.0]/src/win/snprintf.c
[13/176] [pub.egorpugin.libuv-1.42.0]/src/win/udp.c
[16/176] [pub.egorpugin.libuv-1.42.0]/src/win/core.c
[20/176] [pub.egorpugin.libuv-1.42.0]/src/uv-common.c
[24/176] [org.sw.demo.boost.filesystem-1.78.0]/src/directory.cpp
[25/176] [pub.egorpugin.libuv-1.42.0]/src/win/getaddrinfo.c
[26/176] [org.sw.demo.jbeder.yaml_cpp-master]/src/tag.cpp
[28/176] [pub.egorpugin.primitives.yaml-master]/[sw.rc]
[30/176] [pub.egorpugin.libuv-1.42.0]/src/win/process-stdio.c
[31/176] [pub.egorpugin.libuv-1.42.0]/src/win/handle.c
[32/176] [pub.egorpugin.libuv-1.42.0]/src/win/detect-wakeup.c
[34/176] [org.sw.demo.boost.thread-1.78.0]/[sw.rc]
[36/176] [org.sw.demo.boost.filesystem-1.78.0]/src/portability.cpp
[38/176] [pub.egorpugin.libuv-1.42.0]/src/threadpool.c
[40/176] [pub.egorpugin.libuv-1.42.0]/src/win/pipe.c
Exception in file D:/dev/cppan2/client2/src/sw/builder/command.cpp:825, function execute1:
When executing: [pub.egorpugin.primitives.command-master]/src/uv_command.cpp
C:\Users\mbrunecky.sw\storage\pkg\e2\99\efe6\src\sdir\src\string\include\primitives/string.h(42): error C2039: 'u8string': is not a member of 'std'
C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.12.25827\include\unordered_map(15): note: see declaration of 'std'
...

My cmake_build.log reports:

-- Building for: Visual Studio 15 2017
-- Setting policy CMP0091 to NEW
-- Selecting Windows SDK version 10.0.18362.0 to target Windows 10.0.19042.
-- The C compiler identification is MSVC 19.12.25835.0
-- The CXX compiler identification is MSVC 19.12.25835.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2017/Professional/VC/Tools/MSVC/14.12.25827/bin/Hostx86/x86/cl.exe - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2017/Professional/VC/Tools/MSVC/14.12.25827/bin/Hostx86/x86/cl.exe - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Setting build type to 'Release' as none was specified.
-- IPO / LTO supported
-- Performing Test COMPILER_SUPPORTS_MARCH_NATIVE
-- Performing Test COMPILER_SUPPORTS_MARCH_NATIVE - Failed
-- Found SW: C:/Program Files/CMake/bin/sw.exe
-- sw: processing dependencies
-- Configuring incomplete, errors occurred!
See also "C:/Work/tesseract/build/CMakeFiles/CMakeOutput.log".
See also "C:/Work/tesseract/build/CMakeFiles/CMakeError.log".
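
A side note on the redirection used in the command above. Shells apply redirections left to right, so writing 2>&1 before the file redirect points stderr at the terminal first and then sends only stdout to the file. That would explain why the compiler errors show up on the console while cmake_build.log holds only the status lines. The following is a minimal sketch in POSIX shell; emit is a hypothetical stand-in for cmake, and the assumption that cmd.exe orders redirections the same way is not verified in this thread.

```shell
# emit is a hypothetical stand-in for a tool that writes to both streams.
emit() {
  echo "status line"
  echo "error line" >&2
}

# Order used in the report: stderr is duplicated onto the terminal first,
# then stdout alone is redirected to the file.
emit 2>&1 > stdout_only.log

# File first, then stderr joins it: both streams land in the log.
emit > combined.log 2>&1
```

With the second form, the errors and the configure messages end up in the same file, which makes a build log easier to attach to a report.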

zdenop commented 2 years ago

Which basic instructions did you follow?

mbrunecky commented 2 years ago

Thank you. I am following the instructions at: https://tesseract-ocr.github.io/tessdoc/Compiling.html#windows

About 1/3 of the way down the page, under the heading "Develop Tesseract":

For development purposes of Tesseract itself do the next steps:

git clone https://github.com/tesseract-ocr/tesseract tesseract
cd tesseract
mkdir build && cd build
cmake ..

I suspect the problem is sw (some incompatibility with my DevStudio), and (assuming I will not be able to fix it) it would help me to have 'sample' .sln and .proj files that would allow me to build a 'solution' of my own. I am hoping to build a DLL that exposes my own interfaces and internally accesses tesseract 'private' data.

Shreeshrii commented 2 years ago

Please see https://github.com/tesseract-ocr/tesseract/tree/main/.github/workflows which has cmake and vcpkg based workflows for Windows build.

zdenop commented 2 years ago

For those steps, you need a working sw tool/project. Otherwise, you need to use one of the other solutions mentioned by @Shreeshrii.

mbrunecky commented 2 years ago

Thank you. I see the 'workflows' in my repository, so I will try to learn how to use vcpkg. The problem with sw is apparently that my DevStudio 2017 (MSVC 14.16) does not recognize std::u8string.

zdenop commented 2 years ago

std::u8string is part of C++20, which is unfortunate, as tesseract needs only C++17.

@egorpugin: is it possible to use sw with C++17 compilers?

egorpugin commented 2 years ago

Hi,

Sw uses the latest C++ for build scripts. You need the latest VS compiler (VS2019/2022 or light VS 2019/2022 build tools distro installed).

mbrunecky commented 2 years ago

Apparently I need more help. Unfortunately I am not well versed in the tools used here, and without that knowledge this is an uphill battle. I am trying to follow tesseract\.github\workflows\vcpkg.yml from the command line. I installed leptonica (and others):

C:\Work\tesseract>vcpkg\vcpkg install leptonica:x64-windows

and it shows installed:

C:\Work\tesseract>vcpkg\vcpkg list
...
leptonica:x64-windows                              1.81.1#1
...

But the next step:

C:\Work\tesseract>cmake . -B build -DCMAKE_BUILD_TYPE=Release -DSW_BUILD=OFF -DOPENMP_BUILD=OFF -DBUILD_TRAINING_TOOLS=OFF "-DCMAKE_TOOLCHAIN_FILE=${env:GITHUB_WORKSPACE}/vcpkg/scripts/buildsystems/vcpkg.cmake"

tells me:
-- Setting policy CMP0091 to NEW
-- Selecting Windows SDK version 10.0.18362.0 to target Windows 10.0.19042.
Configuring tesseract version 5.1.0-13-g3c22...
-- IPO / LTO supported
CMAKE_SYSTEM_PROCESSOR=<AMD64>
-- Could NOT find PkgConfig (missing: PKG_CONFIG_EXECUTABLE)
-- Could NOT find Leptonica (missing: Leptonica_DIR)
CMake Error at CMakeLists.txt:364 (message):
  Cannot find required library Leptonica.  Quitting!

Obviously, I have no idea what I am doing...

zdenop commented 2 years ago

Do not mix the described build methods (unless you know what you are doing and how to set up your environment or fix the problem). Either use vcpkg for the tesseract compilation as well, or build all dependencies with cmake (it is not difficult, and there are tutorials on the internet on how to do it; alternatively, follow the cmake workflow as suggested by Shreeshrii).
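
A sketch of what the vcpkg-only route might look like when run by hand. It rests on several assumptions that the thread does not confirm: the checkout lives at C:/Work/tesseract, the commands run in a POSIX-style shell such as Git Bash, and the earlier failure came from the toolchain path rather than from something else. The workflow's ${env:GITHUB_WORKSPACE} is PowerShell syntax that cmd.exe does not expand, so the path is spelled out here. The -A x64 option is added to match the leptonica:x64-windows triplet, and a fresh build directory (the name build-vcpkg is made up) avoids reusing the CMake cache left by the sw attempt. The script only prints the command so that it can be reviewed first.

```shell
# Assumed checkout location, taken from the paths shown in the thread.
SRC_DIR="C:/Work/tesseract"

# Hypothetical name for a fresh build tree, separate from the earlier
# build directory that still holds the cache of the failed sw configure.
BUILD_DIR="$SRC_DIR/build-vcpkg"

# Toolchain file shipped inside the vcpkg checkout, written out in full
# instead of relying on the CI-only GITHUB_WORKSPACE variable.
TOOLCHAIN="$SRC_DIR/vcpkg/scripts/buildsystems/vcpkg.cmake"

# Options copied from the command in the thread, plus -A x64 (an addition)
# so the Visual Studio generator targets the installed x64-windows packages.
echo cmake -S "$SRC_DIR" -B "$BUILD_DIR" \
  -A x64 \
  -DCMAKE_BUILD_TYPE=Release \
  -DSW_BUILD=OFF -DOPENMP_BUILD=OFF -DBUILD_TRAINING_TOOLS=OFF \
  "-DCMAKE_TOOLCHAIN_FILE=$TOOLCHAIN"
```

Removing the leading echo runs the configure step. If Leptonica is still not found, checking that the toolchain file actually exists at the printed path is a reasonable first step.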

amitdo commented 2 years ago

Please use our forum for asking questions.