Closed GoogleCodeExporter closed 9 years ago
Confirmed here (also on Arch Linux).
$ python2
Python 2.7.3 (default, Apr 24 2012, 00:00:54)
[GCC 4.7.0 20120414 (prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tesseract
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "tesseract.py", line 26, in <module>
_tesseract = swig_import_helper()
File "tesseract.py", line 18, in swig_import_helper
import _tesseract
ImportError: /usr/lib/python2.7/site-packages/python_tesseract-tesseract-py2.7-linux-x86_64.egg/_tesseract.so: undefined symbol: _ZN9tesseract11TessBaseAPI18SetFillLatticeFuncEMNS_7WordrecEFvRK6MATRIXRKP8list_recRK10UNICHARSETP12BlamerBundleE
The build and install process went as follows:
$ python2 setup.py build
Current Version : tesseract
===========['stdc++', 'tesseract', 'lept']===========
running build
running build_py
creating build
creating build/lib.linux-x86_64-2.7
copying tesseract.py -> build/lib.linux-x86_64-2.7
running build_ext
building '_tesseract' extension
swigging tesseract.i to tesseract_wrap.cpp
swig -python -c++ -I/usr/include/tesseract -I/usr/include/leptonica -o
tesseract_wrap.cpp tesseract.i
/usr/include/tesseract/publictypes.h:73: Warning 462: Unable to set
dimensionless array variable
creating build/temp.linux-x86_64-2.7
gcc -pthread -fno-strict-aliasing -march=x86-64 -mtune=generic -O2 -pipe
-fstack-protector --param=ssp-buffer-size=4 -D_FORTIFY_SOURCE=2 -DNDEBUG
-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector
--param=ssp-buffer-size=4 -D_FORTIFY_SOURCE=2 -fPIC -I.
-I/usr/include/tesseract -I/usr/include/leptonica -I/usr/include/python2.7 -c
tesseract_wrap.cpp -o build/temp.linux-x86_64-2.7/tesseract_wrap.o
gcc -pthread -fno-strict-aliasing -march=x86-64 -mtune=generic -O2 -pipe
-fstack-protector --param=ssp-buffer-size=4 -D_FORTIFY_SOURCE=2 -DNDEBUG
-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector
--param=ssp-buffer-size=4 -D_FORTIFY_SOURCE=2 -fPIC -I.
-I/usr/include/tesseract -I/usr/include/leptonica -I/usr/include/python2.7 -c
main_dummy.cpp -o build/temp.linux-x86_64-2.7/main_dummy.o
In file included from /usr/include/python2.7/Python.h:8:0,
from main_dummy.h:3,
from main_dummy.cpp:25:
/usr/include/python2.7/pyconfig.h:1161:0: warning: "_POSIX_C_SOURCE" redefined
[enabled by default]
In file included from /usr/include/stdio.h:28:0,
from /usr/include/leptonica/alltypes.h:20,
from /usr/include/leptonica/allheaders.h:23,
from main_dummy.cpp:18:
/usr/include/features.h:164:0: note: this is the location of the previous
definition
In file included from /usr/include/python2.7/Python.h:8:0,
from main_dummy.h:3,
from main_dummy.cpp:25:
/usr/include/python2.7/pyconfig.h:1183:0: warning: "_XOPEN_SOURCE" redefined
[enabled by default]
In file included from /usr/include/stdio.h:28:0,
from /usr/include/leptonica/alltypes.h:20,
from /usr/include/leptonica/allheaders.h:23,
from main_dummy.cpp:18:
/usr/include/features.h:166:0: note: this is the location of the previous
definition
main_dummy.cpp: In function ‘int readBuf(const char*, l_uint8*)’:
main_dummy.cpp:54:21: warning: ignoring return value of ‘size_t fread(void*,
size_t, size_t, FILE*)’, declared with attribute warn_unused_result
[-Wunused-result]
gcc -pthread -fno-strict-aliasing -march=x86-64 -mtune=generic -O2 -pipe
-fstack-protector --param=ssp-buffer-size=4 -D_FORTIFY_SOURCE=2 -DNDEBUG
-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector
--param=ssp-buffer-size=4 -D_FORTIFY_SOURCE=2 -fPIC -I.
-I/usr/include/tesseract -I/usr/include/leptonica -I/usr/include/python2.7 -c
fmemopen.c -o build/temp.linux-x86_64-2.7/fmemopen.o
g++ -pthread -shared
-Wl,-O1,--sort-common,--as-needed,-z,relro,--hash-style=gnu
build/temp.linux-x86_64-2.7/tesseract_wrap.o
build/temp.linux-x86_64-2.7/main_dummy.o build/temp.linux-x86_64-2.7/fmemopen.o
-L/usr/lib -lstdc++ -ltesseract -llept -lpython2.7 -o
build/lib.linux-x86_64-2.7/_tesseract.so
$ sudo python2 setup.py install
Current Version : tesseract
===========['stdc++', 'tesseract', 'lept']===========
running install
running bdist_egg
running egg_info
writing python_tesseract.egg-info/PKG-INFO
writing top-level names to python_tesseract.egg-info/top_level.txt
writing dependency_links to python_tesseract.egg-info/dependency_links.txt
unrecognized .svn/entries format in
reading manifest file 'python_tesseract.egg-info/SOURCES.txt'
writing manifest file 'python_tesseract.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
copying tesseract.py -> build/lib.linux-x86_64-2.7
running build_ext
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/egg
copying build/lib.linux-x86_64-2.7/_tesseract.so -> build/bdist.linux-x86_64/egg
copying build/lib.linux-x86_64-2.7/tesseract.py -> build/bdist.linux-x86_64/egg
byte-compiling build/bdist.linux-x86_64/egg/tesseract.py to tesseract.pyc
creating stub loader for _tesseract.so
byte-compiling build/bdist.linux-x86_64/egg/_tesseract.py to _tesseract.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying python_tesseract.egg-info/PKG-INFO ->
build/bdist.linux-x86_64/egg/EGG-INFO
copying python_tesseract.egg-info/SOURCES.txt ->
build/bdist.linux-x86_64/egg/EGG-INFO
copying python_tesseract.egg-info/dependency_links.txt ->
build/bdist.linux-x86_64/egg/EGG-INFO
copying python_tesseract.egg-info/top_level.txt ->
build/bdist.linux-x86_64/egg/EGG-INFO
writing build/bdist.linux-x86_64/egg/EGG-INFO/native_libs.txt
zip_safe flag not set; analyzing archive contents...
tesseract: module references __file__
creating dist
creating 'dist/python_tesseract-tesseract-py2.7-linux-x86_64.egg' and adding
'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing python_tesseract-tesseract-py2.7-linux-x86_64.egg
removing '/usr/lib/python2.7/site-packages/python_tesseract-tesseract-py2.7-linux-x86_64.egg' (and everything under it)
creating /usr/lib/python2.7/site-packages/python_tesseract-tesseract-py2.7-linux-x86_64.egg
Extracting python_tesseract-tesseract-py2.7-linux-x86_64.egg to /usr/lib/python2.7/site-packages
python-tesseract tesseract is already the active version in easy-install.pth
Installed /usr/lib/python2.7/site-packages/python_tesseract-tesseract-py2.7-linux-x86_64.egg
Processing dependencies for python-tesseract==tesseract
Finished processing dependencies for python-tesseract==tesseract
Original comment by digitald...@gmail.com
on 2 May 2012 at 5:55
I guess that your version of tesseract is 3.01, not 3.02.
Try commenting out lines 582-586 in baseapi_mini.h:
//#if !defined(__windows__)
/** Sets Wordrec::fill_lattice_ function to point to the given function. */
// void SetFillLatticeFunc(FillLatticeFunc f);
//#endif
Original comment by FreeT...@gmail.com
on 2 May 2012 at 6:24
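The mangled name in the ImportError already says which C++ function the wrapper expects from libtesseract. As a rough sketch (not part of python-tesseract, and written for a current Python rather than the python2 used in the thread), the hypothetical helper `qualified_name` below decodes only the namespace/class/function prefix of an Itanium-ABI symbol and ignores the argument types. The `c++filt` tool from binutils does the full job; this is just to show how the name is laid out.

```python
import re


def qualified_name(mangled):
    """Decode the nested-name prefix (_ZN <len><name>... E) of an Itanium-ABI symbol.

    Hypothetical helper for illustration: argument types, templates and
    cv-qualified names (_ZNK...) are not handled.
    """
    if not mangled.startswith("_ZN"):
        raise ValueError("not a nested Itanium-mangled name: %r" % mangled)
    pos = 3
    parts = []
    while True:
        # Each component is encoded as a decimal length followed by that many characters.
        match = re.match(r"\d+", mangled[pos:])
        if not match:
            break
        length = int(match.group())
        start = pos + match.end()
        parts.append(mangled[start:start + length])
        pos = start + length
    return "::".join(parts)


symbol = ("_ZN9tesseract11TessBaseAPI18SetFillLatticeFuncEMNS_7WordrecEFvRK6MATRIX"
          "RKP8list_recRK10UNICHARSETP12BlamerBundleE")
print(qualified_name(symbol))  # tesseract::TessBaseAPI::SetFillLatticeFunc
```

The decoded name matches the SetFillLatticeFunc declaration quoted above, which is consistent with the wrapper having been generated from newer headers than the installed 3.01 library provides.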
Yes, you are right; tesseract is version 3.01.
Commenting out the lines does not help. I get the same output as before.
Anyway, thanks for your time! Much appreciated! :)
Original comment by digitald...@gmail.com
on 2 May 2012 at 6:45
What is the error message? Is it still the same?
Original comment by FreeT...@gmail.com
on 2 May 2012 at 7:08
Clean it first.
python setup.py clean
Original comment by FreeT...@gmail.com
on 2 May 2012 at 7:10
Yes, I did run a clean first, but apparently I didn't pay enough attention: the
error is now a different 'undefined symbol':
$ python2
Python 2.7.3 (default, Apr 24 2012, 00:00:54)
[GCC 4.7.0 20120414 (prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tesseract
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "tesseract.py", line 26, in <module>
_tesseract = swig_import_helper()
File "tesseract.py", line 18, in swig_import_helper
import _tesseract
ImportError: /usr/lib/python2.7/site-packages/python_tesseract-tesseract-py2.7-linux-x86_64.egg/_tesseract.so: undefined symbol: _ZN9tesseract11TessBaseAPI18GetComponentImagesENS_17PageIteratorLevelEbPP4PixaPPi
And the full output from building and installing:
$ sudo python2 setup.py clean; python2 setup.py build; sudo python2 setup.py
install
Password:
Current Version : tesseract
===========['stdc++', 'tesseract', 'lept']===========
running clean
Current Version : tesseract
===========['stdc++', 'tesseract', 'lept']===========
running build
running build_py
creating build
creating build/lib.linux-x86_64-2.7
copying tesseract.py -> build/lib.linux-x86_64-2.7
running build_ext
building '_tesseract' extension
swigging tesseract.i to tesseract_wrap.cpp
swig -python -c++ -I/usr/include/tesseract -I/usr/include/leptonica -o
tesseract_wrap.cpp tesseract.i
/usr/include/tesseract/publictypes.h:73: Warning 462: Unable to set
dimensionless array variable
creating build/temp.linux-x86_64-2.7
gcc -pthread -fno-strict-aliasing -march=x86-64 -mtune=generic -O2 -pipe
-fstack-protector --param=ssp-buffer-size=4 -D_FORTIFY_SOURCE=2 -DNDEBUG
-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector
--param=ssp-buffer-size=4 -D_FORTIFY_SOURCE=2 -fPIC -I.
-I/usr/include/tesseract -I/usr/include/leptonica -I/usr/include/python2.7 -c
tesseract_wrap.cpp -o build/temp.linux-x86_64-2.7/tesseract_wrap.o
gcc -pthread -fno-strict-aliasing -march=x86-64 -mtune=generic -O2 -pipe
-fstack-protector --param=ssp-buffer-size=4 -D_FORTIFY_SOURCE=2 -DNDEBUG
-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector
--param=ssp-buffer-size=4 -D_FORTIFY_SOURCE=2 -fPIC -I.
-I/usr/include/tesseract -I/usr/include/leptonica -I/usr/include/python2.7 -c
main_dummy.cpp -o build/temp.linux-x86_64-2.7/main_dummy.o
In file included from /usr/include/python2.7/Python.h:8:0,
from main_dummy.h:3,
from main_dummy.cpp:25:
/usr/include/python2.7/pyconfig.h:1161:0: warning: "_POSIX_C_SOURCE" redefined
[enabled by default]
In file included from /usr/include/stdio.h:28:0,
from /usr/include/leptonica/alltypes.h:20,
from /usr/include/leptonica/allheaders.h:23,
from main_dummy.cpp:18:
/usr/include/features.h:164:0: note: this is the location of the previous
definition
In file included from /usr/include/python2.7/Python.h:8:0,
from main_dummy.h:3,
from main_dummy.cpp:25:
/usr/include/python2.7/pyconfig.h:1183:0: warning: "_XOPEN_SOURCE" redefined
[enabled by default]
In file included from /usr/include/stdio.h:28:0,
from /usr/include/leptonica/alltypes.h:20,
from /usr/include/leptonica/allheaders.h:23,
from main_dummy.cpp:18:
/usr/include/features.h:166:0: note: this is the location of the previous
definition
main_dummy.cpp: In function ‘int readBuf(const char*, l_uint8*)’:
main_dummy.cpp:54:21: warning: ignoring return value of ‘size_t fread(void*,
size_t, size_t, FILE*)’, declared with attribute warn_unused_result
[-Wunused-result]
gcc -pthread -fno-strict-aliasing -march=x86-64 -mtune=generic -O2 -pipe
-fstack-protector --param=ssp-buffer-size=4 -D_FORTIFY_SOURCE=2 -DNDEBUG
-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector
--param=ssp-buffer-size=4 -D_FORTIFY_SOURCE=2 -fPIC -I.
-I/usr/include/tesseract -I/usr/include/leptonica -I/usr/include/python2.7 -c
fmemopen.c -o build/temp.linux-x86_64-2.7/fmemopen.o
g++ -pthread -shared
-Wl,-O1,--sort-common,--as-needed,-z,relro,--hash-style=gnu
build/temp.linux-x86_64-2.7/tesseract_wrap.o
build/temp.linux-x86_64-2.7/main_dummy.o build/temp.linux-x86_64-2.7/fmemopen.o
-L/usr/lib -lstdc++ -ltesseract -llept -lpython2.7 -o
build/lib.linux-x86_64-2.7/_tesseract.so
Current Version : tesseract
===========['stdc++', 'tesseract', 'lept']===========
running install
running bdist_egg
running egg_info
writing python_tesseract.egg-info/PKG-INFO
writing top-level names to python_tesseract.egg-info/top_level.txt
writing dependency_links to python_tesseract.egg-info/dependency_links.txt
unrecognized .svn/entries format in
reading manifest file 'python_tesseract.egg-info/SOURCES.txt'
writing manifest file 'python_tesseract.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
copying tesseract.py -> build/lib.linux-x86_64-2.7
running build_ext
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/egg
copying build/lib.linux-x86_64-2.7/_tesseract.so -> build/bdist.linux-x86_64/egg
copying build/lib.linux-x86_64-2.7/tesseract.py -> build/bdist.linux-x86_64/egg
byte-compiling build/bdist.linux-x86_64/egg/tesseract.py to tesseract.pyc
creating stub loader for _tesseract.so
byte-compiling build/bdist.linux-x86_64/egg/_tesseract.py to _tesseract.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying python_tesseract.egg-info/PKG-INFO ->
build/bdist.linux-x86_64/egg/EGG-INFO
copying python_tesseract.egg-info/SOURCES.txt ->
build/bdist.linux-x86_64/egg/EGG-INFO
copying python_tesseract.egg-info/dependency_links.txt ->
build/bdist.linux-x86_64/egg/EGG-INFO
copying python_tesseract.egg-info/top_level.txt ->
build/bdist.linux-x86_64/egg/EGG-INFO
writing build/bdist.linux-x86_64/egg/EGG-INFO/native_libs.txt
zip_safe flag not set; analyzing archive contents...
tesseract: module references __file__
creating dist
creating 'dist/python_tesseract-tesseract-py2.7-linux-x86_64.egg' and adding
'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing python_tesseract-tesseract-py2.7-linux-x86_64.egg
removing '/usr/lib/python2.7/site-packages/python_tesseract-tesseract-py2.7-linux-x86_64.egg' (and everything under it)
creating /usr/lib/python2.7/site-packages/python_tesseract-tesseract-py2.7-linux-x86_64.egg
Extracting python_tesseract-tesseract-py2.7-linux-x86_64.egg to /usr/lib/python2.7/site-packages
python-tesseract tesseract is already the active version in easy-install.pth
Installed /usr/lib/python2.7/site-packages/python_tesseract-tesseract-py2.7-linux-x86_64.egg
Processing dependencies for python-tesseract==tesseract
Finished processing dependencies for python-tesseract==tesseract
Original comment by digitald...@gmail.com
on 2 May 2012 at 7:16
You have three choices:
1. Compile the SVN version of tesseract-ocr:
http://code.google.com/p/tesseract-ocr/wiki/TesseractSvnInstallation
2. Download the compiled version of python-tesseract:
http://code.google.com/p/python-tesseract/downloads/list
3. Comment out every function in baseapi_mini.h that is missing from the
installed library, e.g. lines 404-406:
//Boxa* GetComponentImages(PageIteratorLevel level,
// bool text_only,
// Pixa** pixa, int** blockids);
Original comment by FreeT...@gmail.com
on 3 May 2012 at 3:16
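For option 3 it helps to know up front which functions the installed library lacks, instead of finding them one failed import at a time. The following is a minimal sketch using ctypes; `missing_symbols` is a hypothetical helper, the two mangled names are the ones from the error messages earlier in this thread, and the soname `libtesseract.so.3` is taken from later in the thread. It assumes a Linux system and a current Python.

```python
import ctypes


def missing_symbols(library, symbols):
    """Return the names in `symbols` that `library` does not export.

    `library` is a path or soname; ctypes.CDLL raises OSError if it cannot be loaded.
    """
    lib = ctypes.CDLL(library)
    # Attribute lookup on a CDLL object resolves the symbol and fails
    # with AttributeError when it is absent, so hasattr() is enough here.
    return [name for name in symbols if not hasattr(lib, name)]


if __name__ == "__main__":
    wanted = [
        "_ZN9tesseract11TessBaseAPI18SetFillLatticeFuncEMNS_7WordrecEFvRK6MATRIX"
        "RKP8list_recRK10UNICHARSETP12BlamerBundleE",
        "_ZN9tesseract11TessBaseAPI18GetComponentImagesENS_17PageIteratorLevelEbPP4PixaPPi",
    ]
    try:
        print(missing_symbols("libtesseract.so.3", wanted))
    except OSError as exc:
        print("could not load libtesseract.so.3:", exc)
```

Any name the helper reports corresponds to a declaration that would have to be commented out, or to a reason to upgrade the library instead.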
Thanks for your suggestions!
I tried #3, which didn't work. I still got the 'undefined symbol' error,
although it was yet another different 'symbol' this time.
Suggestion #2 is not really a possibility, as Arch Linux doesn't use rpm or
deb. There are tools to convert those package types, but I would rather avoid
them.
I went for suggestion #1, building tesseract-ocr from SVN. It built and
installed without problems (I have tested it from the command line), but when I
try to build python-tesseract I now get the following error. It is probably a
missing dependency, but I am not really sure what it should be. Any ideas?
$ sudo python2 setup.py clean; python2 setup.py build
Password:
Current Version : tesseract
===========['stdc++', 'tesseract', 'lept']===========
running clean
Current Version : tesseract
===========['stdc++', 'tesseract', 'lept']===========
running build
running build_py
creating build
creating build/lib.linux-x86_64-2.7
copying tesseract.py -> build/lib.linux-x86_64-2.7
running build_ext
building '_tesseract' extension
swigging tesseract.i to tesseract_wrap.cpp
swig -python -c++ -I/usr/local/include/tesseract -I/usr/include/leptonica -o
tesseract_wrap.cpp tesseract.i
/usr/local/include/tesseract/publictypes.h:78: Warning 462: Unable to set
dimensionless array variable
creating build/temp.linux-x86_64-2.7
gcc -pthread -fno-strict-aliasing -march=x86-64 -mtune=generic -O2 -pipe
-fstack-protector --param=ssp-buffer-size=4 -D_FORTIFY_SOURCE=2 -DNDEBUG
-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector
--param=ssp-buffer-size=4 -D_FORTIFY_SOURCE=2 -fPIC -I.
-I/usr/local/include/tesseract -I/usr/include/leptonica
-I/usr/include/python2.7 -c tesseract_wrap.cpp -o
build/temp.linux-x86_64-2.7/tesseract_wrap.o
gcc -pthread -fno-strict-aliasing -march=x86-64 -mtune=generic -O2 -pipe
-fstack-protector --param=ssp-buffer-size=4 -D_FORTIFY_SOURCE=2 -DNDEBUG
-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector
--param=ssp-buffer-size=4 -D_FORTIFY_SOURCE=2 -fPIC -I.
-I/usr/local/include/tesseract -I/usr/include/leptonica
-I/usr/include/python2.7 -c main_dummy.cpp -o
build/temp.linux-x86_64-2.7/main_dummy.o
main_dummy.cpp:20:17: fatal error: img.h: No such file or directory
compilation terminated.
error: command 'gcc' failed with exit status 1
Original comment by digitald...@gmail.com
on 3 May 2012 at 9:02
img.h should be in either
/usr/include/tesseract/img.h
or
/usr/local/include/tesseract/img.h
Where is your tesseract SVN build actually installed?
Original comment by FreeT...@gmail.com
on 3 May 2012 at 9:11
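A quick way to answer that question is to look in both candidate directories. The sketch below is only an illustration: `find_header` is a hypothetical helper, and the two directories are the ones named in the comment above.

```python
import os


def find_header(name, include_dirs):
    """Return the directories in `include_dirs` that contain a file called `name`."""
    return [d for d in include_dirs if os.path.isfile(os.path.join(d, name))]


if __name__ == "__main__":
    dirs = ["/usr/include/tesseract", "/usr/local/include/tesseract"]
    found = find_header("img.h", dirs)
    print("img.h found in:", found if found else "neither directory")
```

If the header turns up in neither directory, the install step did not copy it; if it turns up in both, two tesseract installations are probably mixed.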
If the directory /usr/local/include/tesseract is not used, remove it.
Original comment by FreeT...@gmail.com
on 3 May 2012 at 9:22
Also, according to
http://code.google.com/p/python-tesseract/wiki/HowToCompilePythonTesseract
the steps should be:
python config.py
python setup.py clean
python setup.py build
sudo python setup.py install
Original comment by FreeT...@gmail.com
on 3 May 2012 at 9:25
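The order of these steps matters, and a failure in one of them is easy to miss in a long log. As a rough sketch, the hypothetical `run_steps` helper below runs the unprivileged steps in sequence and stops at the first one that fails. It assumes it is started from the python-tesseract source directory, and it leaves out the final `sudo python setup.py install` because that step needs root.

```python
import subprocess
import sys

# Script name and arguments for each step, in the order given above.
STEPS = [
    ["config.py"],
    ["setup.py", "clean"],
    ["setup.py", "build"],
]


def run_steps(steps, python=sys.executable, call=subprocess.call):
    """Run each step with `python`; return the first failing step, or None if all succeed."""
    for args in steps:
        status = call([python] + args)
        if status != 0:
            return args
    return None


if __name__ == "__main__":
    failed = run_steps(STEPS)
    if failed is None:
        print("all steps succeeded; now run: sudo python setup.py install")
    else:
        print("step failed:", " ".join(failed))
```

Pass `python="python2"` on systems such as Arch Linux, where the thread uses the `python2` command to get the 2.7 interpreter.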
My tesseract SVN installation is just in a subdirectory of my home. I figured
that 'sudo make install' would make tesseract-ocr copy the appropriate files to
/usr/local/include, /usr/local/bin, etc., but apparently it is not the case. It
_does_ copy a few files to /usr/local/include/tesseract, but not img.h.
Should I use a specific prefix for the make install process? I could of course
just symlink the files, but I would rather avoid that kind of hack if I can.
:)
Original comment by digitald...@gmail.com
on 3 May 2012 at 9:40
Just copy img.h to /usr/local/include/tesseract then.
Original comment by FreeT...@gmail.com
on 3 May 2012 at 10:00
Okay, for reference I needed to copy tesseract-ocr/image/img.h,
tesseract-ocr/ccutil/tprintf.h and tesseract-ocr/api/tesseractmain.h to
/usr/local/include/tesseract. It is now possible to build and install
python-tesseract, but I still get errors when trying to import tesseract:
$ python2
Python 2.7.3 (default, Apr 24 2012, 00:00:54)
[GCC 4.7.0 20120414 (prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tesseract
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/site-packages/python_tesseract-tesseract-py2.7-linux-x86_64.egg/tesseract.py", line 26, in <module>
_tesseract = swig_import_helper()
File "/usr/lib/python2.7/site-packages/python_tesseract-tesseract-py2.7-linux-x86_64.egg/tesseract.py", line 22, in swig_import_helper
_mod = imp.load_module('_tesseract', fp, pathname, description)
ImportError: libtesseract.so.3: cannot open shared object file: No such file or directory
The path to libtesseract.so.3 on my computer is
/usr/local/lib/libtesseract.so.3. Where does python-tesseract expect
libtesseract.so.3 to be placed?
Original comment by digitald...@gmail.com
on 3 May 2012 at 1:40
Are you sure you have only one version of tesseract on your system?
If so, copy libtesseract.so.3 to /usr/lib as well.
Original comment by FreeT...@gmail.com
on 3 May 2012 at 2:58
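The ImportError above comes from the dynamic loader, not from Python: the file exists in /usr/local/lib, but presumably that directory is not on the loader's search path on this machine. The sketch below checks what the loader can see; `can_load` is a hypothetical helper, while `ctypes.util.find_library` and `ctypes.CDLL` are standard-library calls.

```python
import ctypes
import ctypes.util


def can_load(soname):
    """Return True if the dynamic loader can open `soname`, False otherwise."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # find_library returns None when the library is not on the search path.
    print("find_library:", ctypes.util.find_library("tesseract"))
    print("loadable:", can_load("libtesseract.so.3"))
```

Copying the file to /usr/lib is one fix. The usual alternatives are adding /usr/local/lib to the loader configuration and running ldconfig, or setting LD_LIBRARY_PATH for the session.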
Yes, I uninstalled tesseract 3.01 before installing from SVN.
But copying libtesseract.so.3 to /usr/lib solved the problem: python-tesseract
now works like a charm! Thank you very much for your help! :)
Original comment by digitald...@gmail.com
on 3 May 2012 at 3:20
You need to be consistent in your use of the prefix.
In your case, it should be:
./configure --prefix=/usr
Original comment by FreeT...@gmail.com
on 3 May 2012 at 4:09
[deleted comment]
By the way, please send me your compiled package after running the following
command:
python setup.py bdist --format=gztar
Many thanks
Original comment by FreeT...@gmail.com
on 3 May 2012 at 4:44
Original issue reported on code.google.com by
marin.st...@gmail.com
on 2 May 2012 at 4:18