ropensci / tesseract

Bindings to Tesseract OCR engine for R
https://docs.ropensci.org/tesseract

Configure fails to get leptonica include dir #61

Closed Enchufa2 closed 2 years ago

Enchufa2 commented 2 years ago

Tesseract 5.2 (in Fedora 37+) no longer lists leptonica as a requirement in its .pc file. As a result, installation fails because the configure script does not get the leptonica include dir. Example:

** using staged installation
Found pkg-config cflags and libs!
Using PKG_CFLAGS=
Using PKG_LIBS=-L/usr/lib -ltesseract 
Using CXX11CPP: g++ -m64 -E -std=gnu++11
--------------------------- [ANTICONF] --------------------------------
Configuration failed to find 'tesseract' system library. Try installing:
 * deb: libtesseract-dev libleptonica-dev (Debian, Ubuntu, etc)
 * rpm: tesseract-devel leptonica-devel (Fedora, CentOS, RHEL)
 * brew: tesseract (Mac OSX)
If tesseract is already installed, check that 'pkg-config' is in your
PATH and PKG_CONFIG_PATH contains a tesseract.pc file. If pkg-config
is unavailable you can set INCLUDE_DIR and LIB_DIR manually via:
R CMD INSTALL --configure-vars='INCLUDE_DIR=... LIB_DIR=...'
-------------------------- [ERROR MESSAGE] ---------------------------
tools/test.cpp:2:10: fatal error: allheaders.h: No such file or directory
    2 | #include <allheaders.h>
      |          ^~~~~~~~~~~~~~
compilation terminated.
--------------------------------------------------------------------
ERROR: configuration failed for package 'tesseract'

An additional call to pkg-config --cflags --silence-errors lept would be required.
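A minimal sketch of what such a fallback could look like in a configure-style shell script. This is not the package's actual configure code: the helper name collect_cflags and the overridable PKG_CONFIG variable are assumptions made for illustration. Only the pkg-config flags --cflags and --silence-errors are taken from the comment above.

```shell
#!/bin/sh
# Hypothetical helper (not from the tesseract R package): query cflags
# for each module separately, so a missing "Requires" entry for lept in
# tesseract.pc does not lose the leptonica include dir.

# Assumption: PKG_CONFIG may be overridden; defaults to plain pkg-config.
: "${PKG_CONFIG:=pkg-config}"

collect_cflags() {
  flags=""
  for mod in "$@"; do
    # A missing module (or missing pkg-config) contributes nothing.
    mod_flags=$("$PKG_CONFIG" --cflags --silence-errors "$mod" 2>/dev/null) || mod_flags=""
    flags="$flags $mod_flags"
  done
  # Collapse surrounding and repeated whitespace into single spaces.
  set -- $flags
  printf '%s\n' "$*"
}

PKG_CFLAGS=$(collect_cflags tesseract lept)
echo "Using PKG_CFLAGS=$PKG_CFLAGS"
```

With a tesseract.pc that does not pull in lept, the second query would still contribute the leptonica include flags, provided a lept.pc file is installed.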

jeroen commented 2 years ago

Are you sure this is not a bug in Fedora's tesseract package? On homebrew the tesseract.pc file still has this:

prefix=/usr/local/Cellar/tesseract/5.2.0
exec_prefix=${prefix}
bindir=${exec_prefix}/bin
datarootdir = /usr/local/share
datadir=${datarootdir}
libdir=${exec_prefix}/lib
includedir=${prefix}/include

Name: tesseract
Description: An OCR Engine that was developed at HP Labs between 1985 and 1995... and now at Google.
URL: https://github.com/tesseract-ocr/tesseract
Version: 5.2.0
Requires.private: lept
Libs: -L${libdir} -ltesseract -L/usr/local/Cellar/libarchive/3.6.1/lib -larchive -lcurl
Libs.private: -lpthread
Cflags: -I${includedir}

Enchufa2 commented 2 years ago

@manisandro I see in the changelog that you did the update. Any idea?

jeroen commented 2 years ago

It has not changed upstream: https://github.com/tesseract-ocr/tesseract/blob/HEAD/tesseract.pc.in#L13

Enchufa2 commented 2 years ago

That's not the one that is used in cmake-based installations. This one is: https://github.com/tesseract-ocr/tesseract/blob/main/tesseract.pc.cmake
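To tell which variant of the file a system actually has, one can check whether the installed tesseract.pc names lept in a Requires or Requires.private field. A rough sketch follows. The helper name pc_requires_lept and the default path are assumptions for illustration, and the path in particular varies between distributions.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the given .pc file lists
# "lept" as a token in its Requires or Requires.private field.
pc_requires_lept() {
  grep -Eq '^Requires(\.private)?:(.*[[:space:],])?lept([[:space:],]|$)' "$1"
}

# Assumed location of the installed file; adjust for the system at hand.
pc_file=${1:-/usr/lib64/pkgconfig/tesseract.pc}

if [ -f "$pc_file" ]; then
  if pc_requires_lept "$pc_file"; then
    echo "$pc_file requires lept"
  else
    echo "$pc_file does not require lept"
  fi
else
  echo "$pc_file not found"
fi
```

A file like the homebrew one quoted earlier in the thread, which contains the line Requires.private: lept, would be reported as requiring lept.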

jeroen commented 2 years ago

Ah, did you switch the rpm to cmake? That looks like a bug upstream...

Let's see what they say: https://github.com/tesseract-ocr/tesseract/pull/3930

Enchufa2 commented 2 years ago

It seems so. Here: https://src.fedoraproject.org/rpms/tesseract/c/622725d18425696cdca5447558fd9e3ca892f858?branch=rawhide

@jeroen Thanks for the quick patch.

@manisandro Do you think we could add this downstream until upstream fixes this?

jeroen commented 2 years ago

Upstream has merged it already. Maybe you can cherry-pick this for the rpm: https://github.com/tesseract-ocr/tesseract/commit/aee19fcf8eb832

manisandro commented 2 years ago

Done for tesseract-5.2.0-5.fc37 and tesseract-5.2.0-5.fc38.