Open GoogleCodeExporter opened 8 years ago
I wish I could help but I don't have Raspberry
Original comment by FreeT...@gmail.com
on 14 Jul 2014 at 10:28
After trying the shotgun approach, I found a way that works. The 0.7.4 version
is not promoted on the site, so it took some time finding it. Most of the
installed programs are unnecessary, but it will take some time figuring out
what is needed. This is what worked for me:
sudo apt-get install python-distutils-extra tesseract-ocr tesseract-ocr-eng libopencv-dev libtesseract-dev libleptonica-dev python-all-dev swig libcv-dev python-opencv python-numpy python-setuptools build-essential subversion
sudo apt-get install tesseract-ocr-eng tesseract-ocr-dev libleptonica-dev python-all-dev swig libcv-dev
sudo svn checkout http://python-tesseract.googlecode.com/svn/python-tesseract-0.7.4/
sudo python setup.py build
sudo python setup.py install
Original comment by JorenVra...@gmail.com
on 14 Jul 2014 at 1:23
It is great to know that it works. Did you modify the code at all?
Also, which tesseract version are you using?
I need your input so that I can hopefully backport 0.7.4 to the mainstream
version.
Also, would you mind telling me whether you are doing this for fun or for work?
Joe
Original comment by FreeT...@gmail.com
on 14 Jul 2014 at 2:42
I did not modify the code, just checked it out with subversion and installed
with setup.py build and setup.py install. I have added 2 files, these contain
the output of the "sudo python setup.py build" and the "sudo python setup.py
install" commands.
I use tesseract version 3.02 (latest available version on raspbian).
At the moment I use python-tesseract for a school project.
P.S.
The necessary programs seem to be (some of which are already installed on
Raspbian):
sudo apt-get install tesseract-ocr tesseract-ocr-eng libtesseract-dev libleptonica-dev python-all-dev swig build-essential subversion python-setuptools
Original comment by JorenVra...@gmail.com
on 14 Jul 2014 at 7:21
Attachments:
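As a quick sanity check before building, the command-line tools behind the package list above can be looked up on PATH. This is a minimal Python 3 sketch and not part of python-tesseract. The tool names (swig, svn, gcc, tesseract) are an assumption about which executables those packages provide, and the helper name missing_tools is hypothetical.

```python
import shutil

# Assumed executables provided by the packages listed above.
BUILD_TOOLS = ["swig", "svn", "gcc", "tesseract"]


def missing_tools(tools):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(BUILD_TOOLS)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All build tools found.")
```

This only confirms that the executables exist. It does not check the development headers (libtesseract-dev, libleptonica-dev), which the build also needs.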
I was able to get python-tesseract 0.7.4 to work on the Raspberry Pi with
Tesseract 3.02, but not with Tesseract 3.03-rc1 (revision 1049) and Leptonica
1.70 built from source. I reinstalled the libraries after compiling from
source. Here's the error I get:
pi@raspberrypi:~/ocr/python-tesseract-0.7.4$ python
Python 2.7.3 (default, Mar 18 2014, 05:13:23)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> import cv
>>> import tesseract
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "tesseract.py", line 26, in <module>
_tesseract = swig_import_helper()
File "tesseract.py", line 18, in swig_import_helper
import _tesseract
ImportError: /usr/local/lib/python2.7/dist-packages/python_tesseract-0.7.4-py2.7-linux-armv6l.egg/_tesseract.so: undefined symbol: _ZN9tesseract11TessBaseAPI14NormalizeTBLOBEP5TBLOBP3ROWbP6DENORM
Here's version info:
pi@raspberrypi:~$ tesseract -v
tesseract 3.03
leptonica-1.70
libjpeg 8d : libpng 1.2.49 : libtiff 3.9.6 : zlib 1.2.7
Original comment by gcap...@gmail.com
on 18 Aug 2014 at 4:01
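The undefined symbol in the traceback above is a C++ name mangled under the Itanium ABI that GCC uses. To see which function it refers to, the nested-name part can be decoded by hand: after the "_ZN" prefix, each component is written as its length followed by its characters, and "E" ends the list. Below is a minimal Python 3 sketch of that decoding. The function name qualified_name is hypothetical, and the sketch deliberately ignores the parameter types, templates, and substitutions that a real demangler such as c++filt handles.

```python
def qualified_name(mangled):
    """Decode the nested-name part of an Itanium-mangled C++ symbol."""
    prefix = "_ZN"
    if not mangled.startswith(prefix):
        raise ValueError("not a nested Itanium-mangled name: %r" % mangled)
    pos = len(prefix)
    parts = []
    # Each component is <decimal length><identifier>; 'E' closes the list.
    while pos < len(mangled) and mangled[pos] != "E":
        start = pos
        while pos < len(mangled) and mangled[pos].isdigit():
            pos += 1
        if pos == start:
            raise ValueError("unsupported mangling at offset %d" % pos)
        length = int(mangled[start:pos])
        parts.append(mangled[pos:pos + length])
        pos += length
    return "::".join(parts)


symbol = "_ZN9tesseract11TessBaseAPI14NormalizeTBLOBEP5TBLOBP3ROWbP6DENORM"
print(qualified_name(symbol))
```

For the symbol from the traceback this yields the method NormalizeTBLOB of the class TessBaseAPI in the tesseract namespace, which is the declaration to look for in the wrapper's header.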
You may need to comment out NormalizeTBLOBE... in the include file
Original comment by FreeT...@gmail.com
on 19 Aug 2014 at 8:31
I could not find where one would comment out NormalizeTBLOBE... in the include
file. Can you give me more details? Thanks for the help!
Original comment by gcap...@gmail.com
on 19 Aug 2014 at 4:54
locate baseapi_mini.h
comment out the following line
static void NormalizeTBLOB(TBLOB *tblob, ROW *row, bool numeric_mode);
Original comment by FreeT...@gmail.com
on 19 Aug 2014 at 8:25
Thank you! I had to also comment out:
void SetFillLatticeFunc(FillLatticeFunc f);
Boxa* GetComponentImages(PageIteratorLevel level, bool text_only, Pixa** pixa, int** blockids);
void GetFeaturesForBlob(TBLOB* blob, const DENORM& denorm, INT_FEATURE_ARRAY int_features, int* num_features, int* FeatureOutlineIndex);
void RunAdaptiveClassifier(TBLOB* blob, const DENORM& denorm, int num_max_matches, int* unichar_ids, float* ratings, int* num_matches_returned);
Boxa* GetTextlines(Pixa** pixa, int** blockids);
Original comment by gcap...@gmail.com
on 20 Aug 2014 at 2:57
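Commenting out several declarations by hand is error-prone, so the edit described above could be scripted. The following is a minimal Python 3 sketch, assuming each declaration sits on a single line of the header. The helper name comment_out is hypothetical, and a declaration that wraps over several lines would need every continuation line commented out as well.

```python
def comment_out(header_text, names):
    """Prefix '//' to each uncommented line that declares one of `names`."""
    result = []
    for line in header_text.splitlines():
        stripped = line.lstrip()
        # Match "Name(" so that a longer identifier with the same prefix
        # followed by other characters is not picked up by accident.
        declares = any(name + "(" in stripped for name in names)
        if declares and not stripped.startswith("//"):
            line = "//" + line
        result.append(line)
    return "\n".join(result)


sample = (
    "static void NormalizeTBLOB(TBLOB *tblob, ROW *row, bool numeric_mode);\n"
    "void End();"
)
print(comment_out(sample, ["NormalizeTBLOB"]))
```

To apply it to a real file, read baseapi_mini.h into a string, pass the names listed in the comments above, and write the result back after keeping a backup copy.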
Now that you know the technique, you should have no problem trying a newer
version.
Have fun.
Original comment by FreeT...@gmail.com
on 20 Aug 2014 at 3:14
One more thing... I'm getting a segmentation fault on this line:
api.End()
I attached the code.
Original comment by gcap...@gmail.com
on 20 Aug 2014 at 4:09
Attachments:
Then comment out this line:
#api.End()
Original comment by FreeT...@gmail.com
on 20 Aug 2014 at 4:33
Actually, I spoke too soon. I assumed it was the last line (api.End()) that
was causing the problem, since the print statement prior to that line was
executed. It seems that just having api = tesseract.TessBaseAPI() creates an
error upon termination. Sometimes instead of segmentation fault, it gives the
error:
*** glibc detected *** python: corrupted double-linked list: 0x01864570 ***
Aborted
It is unpredictable whether it gives that error or segmentation fault.
Original comment by gcap...@gmail.com
on 20 Aug 2014 at 5:05
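Because the crash happens at interpreter shutdown and alternates between a segmentation fault and a glibc abort, it can help to run the failing script in a child process and look at how it died. This is a minimal Python 3 sketch using only the standard library. The helper names describe_exit and run_isolated are hypothetical, and the script path is whatever test script is being investigated.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Translate a subprocess return code into a readable description."""
    if returncode < 0:
        # On POSIX, a negative code means the child was killed by that signal.
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode


def run_isolated(script_path):
    """Run a script in a separate interpreter and report how it ended."""
    completed = subprocess.run([sys.executable, script_path])
    return describe_exit(completed.returncode)
```

Running the same script several times this way shows how often each failure mode occurs, without relying on the shell's message.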
Could you try a newer version? The old version did have a memory leak.
Original comment by FreeT...@gmail.com
on 20 Aug 2014 at 8:38
First I tried 0.9 and could not compile. Attached is the output.
Then I tried r444 and when I tried import tesseract, I got:
pi@raspberrypi:~/ocr/python-tesseract-r444$ python
Python 2.7.3 (default, Mar 18 2014, 05:13:23)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tesseract
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "tesseract.py", line 26, in <module>
_tesseract = swig_import_helper()
File "tesseract.py", line 18, in swig_import_helper
import _tesseract
ImportError: /usr/local/lib/python2.7/dist-packages/python_tesseract-r444-py2.7-linux-armv6l.egg/_tesseract.so: undefined symbol: cvSetData
I looked for cvSetData in baseapi_mini.h, but could not find it and it's not
obvious to me which file to modify.
Original comment by gcap...@gmail.com
on 20 Aug 2014 at 5:33
Attachments:
As you are using the newest version of tesseract, you had better use the newest
version from svn.
Original comment by FreeT...@gmail.com
on 20 Aug 2014 at 6:09
I'm using, which is
svn, version 1.7.5 (r1336830)
compiled Mar 22 2014, 03:08:50
You think I need to upgrade to 1.8.x ?
Original comment by gcap...@gmail.com
on 20 Aug 2014 at 6:26
[deleted comment]
svn checkout http://python-tesseract.googlecode.com/svn/trunk/ python-tesseract
cd python-tesseract/src
python setup.py build
python setup.py install
Make sure that tesseract is 3.03 and leptonica is 1.70.
Original comment by FreeT...@gmail.com
on 20 Aug 2014 at 6:36
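The version requirement above can be checked from the output of "tesseract -v". Below is a minimal Python 3 sketch that pulls the two version strings out of that output. The helper name parse_versions is hypothetical, and the sample text is the output quoted earlier in this thread.

```python
import re


def parse_versions(version_text):
    """Extract tesseract and leptonica versions from `tesseract -v` output."""
    versions = {}
    tess = re.search(r"^tesseract\s+(\S+)", version_text, re.MULTILINE)
    lept = re.search(r"leptonica-(\S+)", version_text)
    if tess:
        versions["tesseract"] = tess.group(1)
    if lept:
        versions["leptonica"] = lept.group(1)
    return versions


sample = """tesseract 3.03
leptonica-1.70
libjpeg 8d : libpng 1.2.49 : libtiff 3.9.6 : zlib 1.2.7"""
print(parse_versions(sample))
```

When capturing the output with subprocess, it is safest to collect both stdout and stderr, since the stream used for the version banner may differ between builds.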
I followed those exact instructions and got the following error:
...
running build
running build_py
file tesseract.py (for module tesseract) not found
file tesseract.py (for module tesseract) not found
running build_ext
building '_tesseract' extension
swigging tesseract.i to tesseract_wrap.cpp
swig -python -c++ -I/usr/include/tesseract -I/usr/include/leptonica
-I/usr/include/opencv2 -o tesseract_wrap.cpp tesseract.i
tesseract.i:98: Error: Unable to find 'renderer.h'
error: command 'swig' failed with exit status 1
Original comment by gcap...@gmail.com
on 21 Aug 2014 at 3:23
The missing files were under /usr/local/include, so I modified makefile.shsh
and setup.py to have the correct paths.
Original comment by gcap...@gmail.com
on 21 Aug 2014 at 4:21
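When swig reports a header such as renderer.h as missing, a search of the usual include roots shows which -I paths the build needs. This is a minimal Python 3 sketch. The helper name find_header is hypothetical, and the two roots in the usage line are the locations mentioned in this thread.

```python
from pathlib import Path


def find_header(name, roots):
    """Return every file called `name` found under the given include roots."""
    hits = []
    for root in roots:
        root_path = Path(root)
        if root_path.is_dir():
            # rglob searches the directory tree recursively.
            hits.extend(sorted(root_path.rglob(name)))
    return hits


if __name__ == "__main__":
    for path in find_header("renderer.h", ["/usr/include", "/usr/local/include"]):
        print(path)
```

The parent directory of each hit is the value to pass to swig and the compiler with -I.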
Ok, I successfully installed 0.9, but I'm still getting the same error:
Python 2.7.3 (default, Mar 18 2014, 05:13:23)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tesseract
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "tesseract.py", line 26, in <module>
_tesseract = swig_import_helper()
File "tesseract.py", line 22, in swig_import_helper
_mod = imp.load_module('_tesseract', fp, pathname, description)
ImportError: ./_tesseract.so: undefined symbol: cvSetData
Original comment by gcap...@gmail.com
on 21 Aug 2014 at 4:56
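An "undefined symbol" error at import time means that no library loaded by the extension provides that symbol. One way to narrow it down is to load a candidate OpenCV library directly and ask whether it exports cvSetData. The sketch below is Python 3 with ctypes. The helper names are hypothetical, and the library path in the usage note is an assumption that must be adjusted to the actual install.

```python
import ctypes


def exports_symbol(library, symbol):
    """Return True if the loaded shared library exposes `symbol`."""
    try:
        # ctypes resolves the symbol lazily on attribute access and raises
        # AttributeError when the dynamic loader cannot find it.
        getattr(library, symbol)
    except AttributeError:
        return False
    return True


def library_exports(path, symbol):
    """Load the shared library at `path` and test for `symbol`."""
    return exports_symbol(ctypes.CDLL(path), symbol)
```

For example, library_exports("/usr/local/lib/libopencv_core.so", "cvSetData") would test one candidate, where that path is only a guess at where OpenCV 2.4.8 was installed. If the library does export the symbol, the likely remaining cause is that _tesseract.so was not linked against it.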
Any ideas on the error in #22 above? I did find that you got a similar error
in the past, according to this issue:
https://code.google.com/p/python-tesseract/issues/detail?id=7
Here's my OpenCV version:
>>> from cv2 import __version__
>>> __version__
'2.4.8'
>>>
I have no problem importing either cv or cv2. Could part of the problem be
that I have OpenCV 2.3.1 in /usr/share/OpenCV and OpenCV 2.4.8 in
/usr/local/share/OpenCV?
I did rename the directory that 2.3.1 sits in, but that did not seem to help.
Original comment by gcap...@gmail.com
on 22 Aug 2014 at 4:32
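With two OpenCV installs on the same machine, it is worth confirming which files Python actually picks up. This is a minimal Python 3 sketch that reports where a module would be imported from without importing it. The helper name module_origin is hypothetical.

```python
import importlib.util


def module_origin(name):
    """Return the file a module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    for module_name in ("cv2", "tesseract"):
        print(module_name, "->", module_origin(module_name))
```

This only identifies the Python binding. The shared libraries that the binding links against are resolved separately by the dynamic loader, so a mismatch there would not show up in this output.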
Could you create an SSH account for me and send the details to my Gmail account?
Original comment by FreeT...@gmail.com
on 24 Aug 2014 at 8:57
Original issue reported on code.google.com by
JorenVra...@gmail.com
on 13 Jul 2014 at 9:00