patcharats / tesseract-ocr

Automatically exported from code.google.com/p/tesseract-ocr

TessAPI problem. Finding tessdata directory when used in scripting languages. #75

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
OK, I have only tested this with Python and AutoHotkey, but I guess it
applies to any case where a script is run by an interpreter.

Let's say we are using Python and have the following directory structure:
"C:\Python\python.exe": the Python interpreter.

"C:\tesseract\": where tesseract is installed. "tessdll.dll",
"\tessdata\" and our script named "tessapi.py" reside in this directory.

Now, as soon as we make a foreign function call to any of the TessAPI
functions that look in "\tessdata\" for settings or language files, the
program closes because it can't find them.
This is because tessdll.dll searches for "\tessdata\" in the path of the
process that loaded it, namely "C:\Python\", and not in the directory where
tessdll.dll is located, "C:\tesseract\".
A "tesseract.log" file is created in the current working directory,
listing the files that couldn't be found.
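
The mismatch described above can be made visible from inside the script. The following is a minimal sketch, not part of the original report; the helper name `candidate_dirs` is hypothetical. It only prints the three directories involved (interpreter, working directory, script), and does not call tessdll.dll.

```python
import os
import sys


def candidate_dirs(script_path):
    """Return the directories involved in the tessdata lookup problem.

    Hypothetical helper for illustration; it does not touch tessdll.dll.
    """
    return {
        # Directory of the host process (e.g. C:\Python\ in the report).
        "interpreter_dir": os.path.dirname(os.path.abspath(sys.executable)),
        # Where tesseract.log is reported to appear.
        "working_dir": os.getcwd(),
        # Directory of the script (e.g. C:\tesseract\ in the report).
        "script_dir": os.path.dirname(os.path.abspath(script_path)),
    }


if __name__ == "__main__":
    for name, path in candidate_dirs(sys.argv[0]).items():
        print(name, "->", path)
```

With the layout from the report, "interpreter_dir" and "script_dir" would differ, which is the situation in which the lookup fails.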

I think a new function should be made available with which one can tell 
the API where "\tessdata\" is located.

For completeness' sake, I am on WinXP SP2 and using the 2.01 binaries of
tesseract.

Original issue reported on code.google.com by foom...@googlemail.com on 22 Oct 2007 at 1:45

GoogleCodeExporter commented 9 years ago
I am also using WinXP. Will you please upload "python.exe" so that I can
experiment with it?

Original comment by withbles...@gmail.com on 25 Oct 2007 at 11:06

GoogleCodeExporter commented 9 years ago
Maybe you can try setting the environment variable "TESSDATA_PREFIX" to your
tesseract home directory:

set TESSDATA_PREFIX=C:\tesseract\
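
A process-local variant of this suggestion can be sketched in Python. Whether the 2.01 Windows DLL actually reads TESSDATA_PREFIX is not confirmed anywhere in this thread, so treat that as an assumption. The helper name `tessdata_prefix` is hypothetical. The variable is changed only for the current process and restored afterwards, so the user's own environment is left alone.

```python
import contextlib
import os


@contextlib.contextmanager
def tessdata_prefix(path):
    """Temporarily set TESSDATA_PREFIX for this process only.

    Assumption: the library consults TESSDATA_PREFIX when it initialises.
    """
    previous = os.environ.get("TESSDATA_PREFIX")
    os.environ["TESSDATA_PREFIX"] = path
    try:
        yield
    finally:
        # Restore the previous state, whether or not the variable existed.
        if previous is None:
            os.environ.pop("TESSDATA_PREFIX", None)
        else:
            os.environ["TESSDATA_PREFIX"] = previous
```

It would be used as `with tessdata_prefix("C:\\tesseract\\"):` around the code that loads and calls the DLL. Setting the variable before the DLL is loaded is probably the safer order.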

Original comment by benstone...@gmail.com on 1 Nov 2007 at 2:37

GoogleCodeExporter commented 9 years ago
@withblessings
You can get python from www.python.org.
@benstonezhang
IIRC the TESSDATA_PREFIX env-var was a *nix-only thing. But anyway, I would
like to create a Python module, and I don't like the idea of a module
changing my env-vars.
I could symlink the tessdata directory on the fly, but that would be like the
most fugly hack from hell. :/

Original comment by foom...@googlemail.com on 23 Nov 2007 at 4:27

GoogleCodeExporter commented 9 years ago
@foomxxx

Could you please share how to use the .dll from AutoHotKey?

I assume you got it working by now...

Original comment by bot.ai...@gmail.com on 15 Jan 2008 at 5:00

GoogleCodeExporter commented 9 years ago
I just did some testing in AHK because I suspected it was a quirk of Python,
but after AHK simply died every time I tried to do a DllCall to tessdll, I
found the real reason and did not bother to investigate further.
However, if you like, you can test it yourself by copying the tessdata
directory to your AHK directory as an awkward workaround. The rest is pretty
simple. To be honest, I didn't get to the point of parsing the data blob
TessAPI produces. However, testing in Python didn't bring satisfying results,
so I stopped investigating this matter completely. Maybe when the problem I
reported gets fixed I'll have another try. In the meantime I suggest you use
cmdret.ahk or this script to capture the output of tesseract.exe.
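
For Python users, the same fallback (running tesseract.exe instead of calling the DLL) might look like the sketch below. It assumes the 2.x command line form `tesseract <image> <outputbase>`, which writes `<outputbase>.txt`. The function names are hypothetical, and running with the install directory as working directory is an assumption intended to help the executable find "tessdata".

```python
import os
import subprocess


def build_command(tesseract_exe, image_path, output_base):
    """Build the argument list; paths are made absolute because the
    working directory is changed when the command is run."""
    return [
        tesseract_exe,
        os.path.abspath(image_path),
        os.path.abspath(output_base),
    ]


def run_ocr(tesseract_exe, image_path, output_base):
    """Run tesseract.exe and return the recognised text."""
    command = build_command(tesseract_exe, image_path, output_base)
    install_dir = os.path.dirname(os.path.abspath(tesseract_exe))
    # Assumption: starting from the install directory avoids the
    # tessdata lookup problem reported in this issue.
    subprocess.check_call(command, cwd=install_dir)
    # Assumption: tesseract 2.x appends ".txt" to the output base.
    with open(command[2] + ".txt") as handle:
        return handle.read()
```

A call such as `run_ocr(r"C:\tesseract\tesseract.exe", "page.tif", "page")` would then return the contents of "page.txt".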

Original comment by foom...@googlemail.com on 22 Jan 2008 at 7:02

GoogleCodeExporter commented 9 years ago
foomxxx, can you share your AHK file for tessdll?
Thanks

Original comment by naveen.g...@gmail.com on 13 Feb 2008 at 2:41

GoogleCodeExporter commented 9 years ago
foomxxx,

I'd also like to have a copy of your AHK script. Can you confirm whether the
bug has been fixed? It has been almost a year since your last post on this,
so I hope it has been fixed.

Thx,

D

Original comment by dblanch...@gmail.com on 8 Dec 2008 at 6:04