Closed FlyingFathead closed 2 years ago
Hi, you need to recompile gImageReader against tesseract 5 - gImageReader will always display the tesseract version it was compiled against.
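To see which Tesseract library an already built binary is tied to, it may help to look at the shared libraries it loads. The sketch below is only an illustration: `tesseract_libs` is a hypothetical helper name, not part of gImageReader, and it simply filters `ldd` output for `libtesseract` entries.

```shell
# Hypothetical helper: filter `ldd` output (read from stdin) down to the
# libtesseract entries, printing the soname and the resolved path.
tesseract_libs() {
  awk '/libtesseract/ { print $1, $3 }'
}

# Example usage (assumes gimagereader-gtk is on PATH):
#   ldd "$(command -v gimagereader-gtk)" | tesseract_libs
```

If the printed soname belongs to an older Tesseract series than the one you expect, the binary was most likely built against that older installation and needs to be rebuilt.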
Thanks for the quick reply & clarification! However, I've now run into another problem when recompiling on Ubuntu 21.10: the gtk build runs with occasional deprecation warnings, but when it reaches 100%, this happens:
[100%] Linking CXX executable gimagereader-gtk
/usr/bin/ld: CMakeFiles/gimagereader.dir/gtk/src/Config.cc.o: in function `Config::getAvailableLanguages()':
Config.cc:(.text+0x2eaa): undefined reference to `tesseract::TessBaseAPI::GetAvailableLanguagesAsVector(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >*) const'
/usr/bin/ld: CMakeFiles/gimagereader.dir/gtk/src/Config.cc.o: in function `tesseract::TessBaseAPI::Init(char const*, char const*)':
Config.cc:(.text._ZN9tesseract11TessBaseAPI4InitEPKcS2_[_ZN9tesseract11TessBaseAPI4InitEPKcS2_]+0x43): undefined reference to `tesseract::TessBaseAPI::Init(char const*, char const*, tesseract::OcrEngineMode, char**, int, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const*, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const*, bool)'
/usr/bin/ld: CMakeFiles/gimagereader.dir/gtk/src/Recognizer.cc.o: in function `Recognizer::recognize(std::vector<int, std::allocator<int> > const&, bool)::{lambda()#1}::operator()() const':
Recognizer.cc:(.text+0x2616): undefined reference to `tesseract::TessBaseAPI::Recognize(tesseract::ETEXT_DESC*)'
/usr/bin/ld: CMakeFiles/gimagereader.dir/gtk/src/Recognizer.cc.o: in function `Recognizer::recognizeImage(Cairo::RefPtr<Cairo::ImageSurface> const&, Recognizer::OutputDestination)::{lambda()#1}::operator()() const':
Recognizer.cc:(.text+0x316c): undefined reference to `tesseract::TessBaseAPI::Recognize(tesseract::ETEXT_DESC*)'
/usr/bin/ld: CMakeFiles/gimagereader.dir/gtk/src/Recognizer.cc.o: in function `Recognizer::recognizeImage(Cairo::RefPtr<Cairo::ImageSurface> const&, Recognizer::OutputDestination)::{lambda()#2}::operator()() const':
Recognizer.cc:(.text+0x3219): undefined reference to `tesseract::TessBaseAPI::Recognize(tesseract::ETEXT_DESC*)'
/usr/bin/ld: CMakeFiles/gimagereader.dir/gtk/src/Recognizer.cc.o: in function `Recognizer::recognizeBatch()::{lambda()#1}::operator()() const':
Recognizer.cc:(.text+0x40ea): undefined reference to `tesseract::TessBaseAPI::Recognize(tesseract::ETEXT_DESC*)'
collect2: error: ld returned 1 exit status
make[2]: *** [CMakeFiles/gimagereader.dir/build.make:718: gimagereader-gtk] Error 1
make[1]: *** [CMakeFiles/Makefile2:182: CMakeFiles/gimagereader.dir/all] Error 2
make: *** [Makefile:149: all] Error 2
I'm pretty sure I've successfully compiled gImageReader on my other Linux desktop machines running Ubuntu 64-bit (amd64). I tried googling for the error messages but found little to no information on what to do with those error messages. Tesseract is compiled from a git source and it works without any issues. All help is kindly appreciated. Thanks!
Probably -ltesseract is missing from your link command. Try make VERBOSE=1 to see the full commands and check whether tesseract appears as a link library.
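The check suggested above can be sketched as a small shell helper. This is an illustration only: `links_tesseract` is a hypothetical function name, and the commented usage assumes the link line can be found by searching the saved build output for `-o gimagereader-gtk`.

```shell
# Hypothetical helper: succeed if the given link command line references
# the tesseract library, either as -ltesseract or as a full library path.
links_tesseract() {
  case " $1 " in
    *" -ltesseract "*|*libtesseract.so*|*libtesseract.a*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage: save the verbose build output, then test the link line.
#   make VERBOSE=1 2>&1 | tee build.log
#   links_tesseract "$(grep -- '-o gimagereader-gtk' build.log)" \
#     && echo "tesseract is on the link line"
```

Note that a positive result only shows the library was requested; it does not show which installed copy of the library the linker actually picked up.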
OK - I tried:
make VERBOSE=1
End result was this:
[100%] Linking CXX executable gimagereader-gtk
/usr/bin/cmake -E cmake_link_script CMakeFiles/gimagereader.dir/link.txt --verbose=1
/usr/bin/c++ -fopenmp CMakeFiles/gimagereader.dir/common/CCITTFax4Encoder.cc.o CMakeFiles/gimagereader.dir/common/PaperSize.cc.o CMakeFiles/gimagereader.dir/gtk/src/Acquirer.cc.o CMakeFiles/gimagereader.dir/gtk/src/Config.cc.o CMakeFiles/gimagereader.dir/gtk/src/ConfigSettings.cc.o CMakeFiles/gimagereader.dir/gtk/src/CrashHandler.cc.o CMakeFiles/gimagereader.dir/gtk/src/DisplayRenderer.cc.o CMakeFiles/gimagereader.dir/gtk/src/Displayer.cc.o CMakeFiles/gimagereader.dir/gtk/src/DisplayerToolSelect.cc.o CMakeFiles/gimagereader.dir/gtk/src/DjVuDocument.cc.o CMakeFiles/gimagereader.dir/gtk/src/FileDialogs.cc.o CMakeFiles/gimagereader.dir/gtk/src/FileTreeModel.cc.o CMakeFiles/gimagereader.dir/gtk/src/FontComboBox.cc.o CMakeFiles/gimagereader.dir/gtk/src/Image.cc.o CMakeFiles/gimagereader.dir/gtk/src/MainWindow.cc.o CMakeFiles/gimagereader.dir/gtk/src/OutputBuffer.cc.o CMakeFiles/gimagereader.dir/gtk/src/OutputEditorText.cc.o CMakeFiles/gimagereader.dir/gtk/src/RecognitionMenu.cc.o CMakeFiles/gimagereader.dir/gtk/src/Recognizer.cc.o CMakeFiles/gimagereader.dir/gtk/src/SearchReplaceFrame.cc.o CMakeFiles/gimagereader.dir/gtk/src/SourceManager.cc.o CMakeFiles/gimagereader.dir/gtk/src/SubstitutionsManager.cc.o CMakeFiles/gimagereader.dir/gtk/src/TessdataManager.cc.o CMakeFiles/gimagereader.dir/gtk/src/Utils.cc.o CMakeFiles/gimagereader.dir/gtk/src/hocr/DisplayerToolHOCR.cc.o CMakeFiles/gimagereader.dir/gtk/src/hocr/HOCRBatchExportDialog.cc.o CMakeFiles/gimagereader.dir/gtk/src/hocr/HOCRDocument.cc.o CMakeFiles/gimagereader.dir/gtk/src/hocr/HOCROdtExporter.cc.o CMakeFiles/gimagereader.dir/gtk/src/hocr/HOCRPdfExportWidget.cc.o CMakeFiles/gimagereader.dir/gtk/src/hocr/HOCRPdfExporter.cc.o CMakeFiles/gimagereader.dir/gtk/src/hocr/HOCRSpellChecker.cc.o CMakeFiles/gimagereader.dir/gtk/src/hocr/HOCRTextExporter.cc.o CMakeFiles/gimagereader.dir/gtk/src/hocr/OutputEditorHOCR.cc.o CMakeFiles/gimagereader.dir/gtk/src/hocr/XmlUtils.cc.o CMakeFiles/gimagereader.dir/gtk/src/main.cc.o 
CMakeFiles/gimagereader.dir/gtk/src/scanner/ScannerSane.cc.o CMakeFiles/gimagereader.dir/gimagereader.gresource.c.o -o gimagereader-gtk -ltesseract -larchive -lgtkmm-3.0 -latkmm-1.6 -lgdkmm-3.0 -lgiomm-2.4 -lgtk-3 -lgdk-3 -latk-1.0 -lcairo-gobject -lgio-2.0 -lpangomm-1.4 -lglibmm-2.4 -lcairomm-1.0 -lsigc-2.0 -lpangocairo-1.0 -lpango-1.0 -lharfbuzz -lcairo -lgdk_pixbuf-2.0 -lgobject-2.0 -lglib-2.0 -lgtksourceviewmm-3.0 -lgtkmm-3.0 -latkmm-1.6 -lgdkmm-3.0 -lgiomm-2.4 -lpangomm-1.4 -lglibmm-2.4 -lcairomm-1.0 -lsigc-2.0 -lgtksourceview-3.0 -lgtk-3 -lgdk-3 -lpangocairo-1.0 -lpango-1.0 -lharfbuzz -latk-1.0 -lcairo-gobject -lcairo -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0 -lgtkspellmm-3.0 -lgtkspell3-3 -lenchant-2 -lgtkmm-3.0 -latkmm-1.6 -lgdkmm-3.0 -lgiomm-2.4 -lgtk-3 -lgdk-3 -latk-1.0 -lcairo-gobject -lgio-2.0 -lpangomm-1.4 -lglibmm-2.4 -lcairomm-1.0 -lsigc-2.0 -lpangocairo-1.0 -lpango-1.0 -lharfbuzz -lcairo -lgdk_pixbuf-2.0 -lgobject-2.0 -lglib-2.0 -lcairomm-1.0 -lcairo -lsigc-2.0 -lpangomm-1.4 -lglibmm-2.4 -lcairomm-1.0 -lsigc-2.0 -lpangocairo-1.0 -lpango-1.0 -lgobject-2.0 -lglib-2.0 -lharfbuzz -lcairo -lpoppler-glib -lgobject-2.0 -lglib-2.0 -lcairo -ljson-glib-1.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0 -lxml++-2.6 -lxml2 -lglibmm-2.4 -lgobject-2.0 -lglib-2.0 -lsigc-2.0 /usr/lib/x86_64-linux-gnu/libjpeg.so -lfontconfig -lfreetype -lzip -lsane -ldjvulibre -lenchant-2 -lpodofo -ldl -lgtkmm-3.0 -latkmm-1.6 -lgdkmm-3.0 -lgiomm-2.4 -lgtk-3 -lgdk-3 -latk-1.0 -lcairo-gobject -lgio-2.0 -lpangomm-1.4 -lglibmm-2.4 -lcairomm-1.0 -lsigc-2.0 -lpangocairo-1.0 -lpango-1.0 -lharfbuzz -lcairo -lgdk_pixbuf-2.0 -lgobject-2.0 -lglib-2.0 -lgtksourceviewmm-3.0 -lgtksourceview-3.0 -lgtkspellmm-3.0 -lgtkspell3-3 -lpoppler-glib -ljson-glib-1.0 -lxml++-2.6 -lxml2 /usr/lib/x86_64-linux-gnu/libjpeg.so -lfontconfig -lfreetype -lzip -lsane -ldjvulibre -lpodofo -ldl
/usr/bin/ld: CMakeFiles/gimagereader.dir/gtk/src/Config.cc.o: in function `Config::getAvailableLanguages()':
Config.cc:(.text+0x2eaa): undefined reference to `tesseract::TessBaseAPI::GetAvailableLanguagesAsVector(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >*) const'
/usr/bin/ld: CMakeFiles/gimagereader.dir/gtk/src/Config.cc.o: in function `tesseract::TessBaseAPI::Init(char const*, char const*)':
Config.cc:(.text._ZN9tesseract11TessBaseAPI4InitEPKcS2_[_ZN9tesseract11TessBaseAPI4InitEPKcS2_]+0x43): undefined reference to `tesseract::TessBaseAPI::Init(char const*, char const*, tesseract::OcrEngineMode, char**, int, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const*, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const*, bool)'
/usr/bin/ld: CMakeFiles/gimagereader.dir/gtk/src/Recognizer.cc.o: in function `Recognizer::recognize(std::vector<int, std::allocator<int> > const&, bool)::{lambda()#1}::operator()() const':
Recognizer.cc:(.text+0x2616): undefined reference to `tesseract::TessBaseAPI::Recognize(tesseract::ETEXT_DESC*)'
/usr/bin/ld: CMakeFiles/gimagereader.dir/gtk/src/Recognizer.cc.o: in function `Recognizer::recognizeImage(Cairo::RefPtr<Cairo::ImageSurface> const&, Recognizer::OutputDestination)::{lambda()#1}::operator()() const':
Recognizer.cc:(.text+0x316c): undefined reference to `tesseract::TessBaseAPI::Recognize(tesseract::ETEXT_DESC*)'
/usr/bin/ld: CMakeFiles/gimagereader.dir/gtk/src/Recognizer.cc.o: in function `Recognizer::recognizeImage(Cairo::RefPtr<Cairo::ImageSurface> const&, Recognizer::OutputDestination)::{lambda()#2}::operator()() const':
Recognizer.cc:(.text+0x3219): undefined reference to `tesseract::TessBaseAPI::Recognize(tesseract::ETEXT_DESC*)'
/usr/bin/ld: CMakeFiles/gimagereader.dir/gtk/src/Recognizer.cc.o: in function `Recognizer::recognizeBatch()::{lambda()#1}::operator()() const':
Recognizer.cc:(.text+0x40ea): undefined reference to `tesseract::TessBaseAPI::Recognize(tesseract::ETEXT_DESC*)'
collect2: error: ld returned 1 exit status
So -ltesseract does appear in the link command, but other than that I've got no clue what's causing the error. Could something be missing because I've built the Tesseract and Leptonica libraries from source as well? Thanks so much for your help so far.
Are you linking against the correct tesseract library? In particular, the one matching the included headers?
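One way to test for such a mismatch is to ask the library itself whether it exports the exact signature the linker is complaining about. The undefined references in the log use `std::vector<std::string>` and `tesseract::ETEXT_DESC`, which would be consistent with newer headers being compiled against while an older library is found at link time; that is an inference from the log, not something confirmed in this thread. In the sketch below, `exports_symbol` is a hypothetical helper name and the library path in the usage comment is an assumption.

```shell
# Hypothetical helper: read demangled symbol names on stdin and succeed
# if the given signature appears among them (fixed-string match).
exports_symbol() {
  grep -F -q -- "$1"
}

# Example usage (the library path is an assumption; adjust to your system):
#   nm -D --defined-only /usr/lib/x86_64-linux-gnu/libtesseract.so | c++filt \
#     | exports_symbol 'tesseract::TessBaseAPI::Recognize(tesseract::ETEXT_DESC*)' \
#     && echo "symbol found" || echo "symbol missing"
```

Running this against each libtesseract copy on the system should show which one, if any, matches the headers that were used for compilation.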
Hmm, good question!
At least for the tessdata (assuming that's what you meant by Tesseract libraries?), they seem to be in two places:
/usr/share/tessdata
/usr/share/tesseract-ocr/4.00/tessdata
I noticed that the issue had popped up elsewhere; i.e. especially in this thread: https://github.com/manisandro/gImageReader/issues/407#issuecomment-496127368
I tried setting the TESSDATA_PREFIX env-var, but that didn't help, and neither did passing -DTESSDATA_PREFIX=/usr/share/tessdata to cmake. Has that functionality been obsoleted?
CMake Warning:
Manually-specified variables were not used by the project:
TESSDATA_PREFIX
All further tips on what I might be missing here would be highly appreciated. Thanks.
TESSDATA_PREFIX is a runtime environment variable which defines where the tessdata files are located; it is not related to any build-time setting. gImageReader detects tesseract via pkg-config, see https://github.com/manisandro/gImageReader/blob/master/CMakeLists.txt#L58. Check that pkg-config returns the desired tesseract includes and libs. If not, either tweak the gImageReader CMakeLists.txt or set PKG_CONFIG_LIBDIR to the directory where the tesseract.pc of your desired installation is located.
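A quick way to spot competing installations is to look for every tesseract.pc on the pkg-config search path. The sketch below is a minimal illustration: `find_tesseract_pc` is a hypothetical helper, and the directories in the usage comment are assumptions about a typical Ubuntu layout rather than paths taken from this thread.

```shell
# Hypothetical helper: print every tesseract.pc found in a colon-separated
# list of directories, in search order. More than one hit suggests that
# several Tesseract installations are competing.
find_tesseract_pc() {
  old_ifs=$IFS
  IFS=:
  for dir in $1; do
    if [ -f "$dir/tesseract.pc" ]; then
      printf '%s\n' "$dir/tesseract.pc"
    fi
  done
  IFS=$old_ifs
}

# Example usage (directory list is an assumption; adjust to your system):
#   find_tesseract_pc "/usr/local/lib/pkgconfig:/usr/lib/pkgconfig:/usr/lib/x86_64-linux-gnu/pkgconfig"
# Then compare with what pkg-config itself resolves:
#   pkg-config --modversion tesseract
#   pkg-config --cflags --libs tesseract
```

Keep in mind that PKG_CONFIG_LIBDIR replaces the default search path, whereas PKG_CONFIG_PATH is searched before it, so with PKG_CONFIG_LIBDIR the other dependencies' .pc directories may need to be listed as well.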
Okay, thanks once more for the clarification!
pkg-config --list-all does show Tesseract in the list, and /usr/include/tesseract in the CMakeLists.txt that you linked to is (and has been) the correct location for -ltesseract...
The tesseract.pc file also seems to be in place in /usr/lib/pkgconfig. Nevertheless, since PKG_CONFIG_PATH had not been set separately as an environment variable, I tried once more after running export PKG_CONFIG_PATH=/usr/lib/pkgconfig just in case, yet no luck: the build still fails at the same spot as before.
I'm beginning to wonder if building Tesseract from source has left something crucial (for compiling gImageReader) uncompiled?
Solved it!
I nuked and paved any residues of libleptonica-dev related material in dpkg as well, then manually went for a bit of search & destroy, i.e. cleared /usr/include/leptonica, re-compiled both leptonica and tesseract from source, and now the compile of gImageReader WORKS! 👍
Thanks so much for your help, have a nice day!
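For anyone retracing this fix, the first step (finding leftover distribution packages) can be sketched as a filter over `dpkg -l` output. `ocr_packages` is a hypothetical helper name; it only lists candidates and removes nothing.

```shell
# Hypothetical helper: read `dpkg -l` output on stdin and print the status
# and name of every package whose name mentions tesseract or leptonica.
ocr_packages() {
  awk '$2 ~ /tesseract|leptonica/ { print $1, $2 }'
}

# Example usage:
#   dpkg -l | ocr_packages
```

In `dpkg -l` output a status of `ii` means the package is installed, while `rc` means it was removed but its configuration files remain. Packaged headers or libraries that shadow a from-source build are the kind of residue described above.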
Glad you solved it, cheers
Hi there, and first of all, thanks for the highly useful software you've created! :)
gImageReader to this day is my favorite go-to tool for Tesseract OCR reading, but sadly, even when building it from source on Ubuntu 21.10, it seems to be using Tesseract version 4.1.1 according to the software's "About" window.
Since I've noticed far better OCR accuracy in Tesseract 5.x that I have compiled from source and added to my system as the go-to Tesseract version, I would like to ask if it's in any way possible to use my newer Tesseract 5.x (git) version with gImageReader?
My tesseract --version shows the latest git version (5.0.1-43 as of this writing) and the latest leptonica libraries are installed, but sadly, gImageReader still seems to stick to Tesseract 4.1.1 regardless.
Thanks once again, and all info on this is highly appreciated.