GoogleCodeExporter closed this issue 9 years ago
I would like to add my support for this. Specifically, it is needed for JNA but
would also help when linking with MinGW, assuming the official build is done
with Visual C++.
Original comment by JerseyChewi@gmail.com
on 26 Sep 2010 at 5:47
And I'm a Delphi developer who was using a wrapper made for Delphi that was
BASED on the C wrapper from version 2.04. Now I'm stuck because this new
version (3.0) doesn't work with it :(
Can I try to import all the functions from baseapi.h? Will that work? I'm in
over my head on this particular issue...
Original comment by rfwo...@gmail.com
on 14 Oct 2010 at 11:18
I know nothing about Delphi but it's probably similar to the Java situation
because of name mangling. If you want to interface directly to the native
library without some additional native connector in the middle, name mangling
will be a barrier. Java can possibly work around it with the help of JNAerator
but I don't know if an equivalent exists for Delphi.
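To make the mangling problem concrete, here is a minimal sketch in C++. The functions `scale`, `scale_int` and `scale_double` are made up for illustration and have nothing to do with Tesseract. The two overloads get compiler-specific symbol names, while the `extern "C"` functions keep plain, predictable names that a foreign-language binding can look up.

```cpp
// Two C++ overloads. The compiler encodes the parameter types into each
// symbol name; with the Itanium ABI used by GCC they become _Z5scalei and
// _Z5scaled. MSVC uses a different, incompatible scheme.
int scale(int x) { return x * 2; }
double scale(double x) { return x * 2.0; }

// C linkage switches C++ mangling off, so these keep the names "scale_int"
// and "scale_double". Overloading is not possible with C linkage, hence the
// two distinct names.
extern "C" int scale_int(int x) { return scale(x); }
extern "C" double scale_double(double x) { return scale(x); }
```

On Windows the functions would additionally have to be exported from the DLL, for example with `__declspec(dllexport)`, which is left out here to keep the sketch portable.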
Original comment by JerseyChewi@gmail.com
on 15 Oct 2010 at 8:15
Can you guys help me out with something? I had a Delphi wrapper that was based
on the C Wrapper - it utilized a 'Recognize' class etc etc.
But now with version 3, as you indicate above, we can't use it.
Instead I have to import the API functions from the DLL - if I can get that
right, then surely I will succeed, but I don't know which API functions to
call, when, and in what order.
Please could somebody assist :)
Original comment by rfwo...@gmail.com
on 19 Oct 2010 at 3:34
This would also allow dynamic loading of the library with dlopen/LoadLibrary.
With C++ classes this is quite nasty.
I noticed however, that the tessdll wrapper code is still there (in the vs2008
directory) and changes were made to it between 2.04 and 3.00. Does it still
work? In a short test, I could not get any results.
Original comment by trop...@gmail.com
on 21 Oct 2010 at 2:35
I also couldn't get any results from it. The main .h file remains unchanged
from version 2.04, so the hope was to simply use the same wrapper as
before, replace the DLL and add the *new* combined eng language file in
/tessdata. But the result of the OCR read is '~', which usually happens if
there's too much noise in an image or if I've given it the wrong pixel format
or something. Best would be to use 'the API', but there are a bajillion functions
in there and I need an example of what to set, and when.
Original comment by rfwo...@gmail.com
on 21 Oct 2010 at 2:39
The call sequence is something like
tess.Init(...);
tess.SetVariable(...);
tess.SetImage(...);
tess.SetRectangle(...);
tess.Recognize(...);
tess.GetXXXText(...);
The only thing I could not figure out is how to use the Recognize(struct
ETEXT_DESC* monitor) method correctly (with monitor != NULL), i.e. how to get
the initial monitor variable.
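The ordering above can be sketched with a stand-in class. `FakeOcr` and `RunSequence` are hypothetical and only track which calls have been made; the real signatures are in baseapi.h and differ from these. The ordering rules enforced here are what the sequence above implies, not verified Tesseract behaviour, and the variable name and image size are placeholders.

```cpp
#include <string>

// Hypothetical stand-in that only tracks call order. It is NOT Tesseract's
// TessBaseAPI; the real signatures live in baseapi.h.
class FakeOcr {
public:
    bool Init(const std::string& language) {
        language_ = language;
        initialized_ = !language.empty();
        return initialized_;
    }
    // Assumed rule: variables can only be set once Init has succeeded.
    bool SetVariable(const std::string& name, const std::string& value) {
        return initialized_ && !name.empty() && !value.empty();
    }
    bool SetImage(int width, int height) {
        if (!initialized_ || width <= 0 || height <= 0) return false;
        width_ = width;
        height_ = height;
        has_image_ = true;
        recognized_ = false;  // a new image invalidates earlier results
        return true;
    }
    // Assumed rule: the rectangle must lie inside the current image.
    bool SetRectangle(int left, int top, int width, int height) {
        if (!has_image_ || left < 0 || top < 0 || width <= 0 || height <= 0)
            return false;
        if (left + width > width_ || top + height > height_) return false;
        recognized_ = false;
        return true;
    }
    bool Recognize() {
        recognized_ = has_image_;
        return recognized_;
    }
    std::string GetText() const {
        return recognized_ ? "text recognized as " + language_ : std::string();
    }

private:
    std::string language_;
    int width_ = 0;
    int height_ = 0;
    bool initialized_ = false;
    bool has_image_ = false;
    bool recognized_ = false;
};

// Makes the calls in the order given above and stops at the first failure.
bool RunSequence(FakeOcr& ocr, std::string* text) {
    if (!ocr.Init("eng")) return false;
    if (!ocr.SetVariable("example_variable", "example_value")) return false;
    if (!ocr.SetImage(640, 480)) return false;
    if (!ocr.SetRectangle(10, 10, 100, 50)) return false;
    if (!ocr.Recognize()) return false;
    *text = ocr.GetText();
    return true;
}
```

Calling `Recognize` on a fresh `FakeOcr` returns false, which is the kind of out-of-order call the sequence is meant to avoid.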
Original comment by trop...@gmail.com
on 22 Oct 2010 at 8:17
troplin
Thanks for that, but I wonder if it's enough to proceed on.
Do you know if there is a document somewhere that confirms the call
sequence?
Original comment by rfwo...@gmail.com
on 22 Oct 2010 at 10:01
I have that from the comments in the source code header file.
Original comment by trop...@gmail.com
on 22 Oct 2010 at 11:09
This is a good idea - who wants to do it?
Original comment by joregan
on 24 Oct 2010 at 10:03
>>This is a good idea - who wants to do it?
I would definitely do it in Delphi if I knew what to do - what needs to be
called etc etc.
I think DependencyWalker can show you the available function calls in a DLL and
the name to use etc.
Original comment by rfwo...@gmail.com
on 25 Oct 2010 at 10:31
It involves writing C++ rather than any other language. I'm probably more
familiar with C++ than the other people requesting this but I'm probably the
least knowledgeable about Tesseract itself. I'm also quite busy. If no one else
steps up, I could give it a shot but I can't promise it'll be done quickly.
Original comment by JerseyChewi@gmail.com
on 25 Oct 2010 at 10:39
I can help testing it for JNA (using Tess4J).
Original comment by nguyen...@gmail.com
on 25 Oct 2010 at 2:13
>>This is a good idea - who wants to do it?
> I would definitely do it in Delphi if I knew what to do
Fail. Delphi will not work for this, even if it wasn't a single-platform
solution where a cross-platform solution is what's wanted.
Original comment by joregan
on 26 Oct 2010 at 9:02
>Fail. Delphi will not work for this, even if it wasn't a single-platform
>solution where a cross-platform solution is what's wanted.
Then I'm rather trapped. I need to get version 3 working in Delphi and I have
no idea what to try and call in the DLL and in what order etc.
I think if the C wrapper was working like it was in version 2.04, then I could
just call the same functions as I did in 2.04.
Original comment by rfwo...@gmail.com
on 26 Oct 2010 at 9:17
The C Wrapper should provide the same functionality as the C++ API:
- Every method of the C++ class should have an equivalent C function
- The C function should call the C++ method and nothing more.
- Every C function should take an additional parameter, which represents the
object itself (like the this pointer in C++)
- The type of this additional object parameter is an opaque data structure,
which either contains the actual C++ object, or is cast to it.
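A minimal sketch of the rules above, assuming a made-up class `Region` in place of the real API. `Region`, `RegionHandle` and the `Region_*` functions are hypothetical names chosen for illustration. Each C function takes the handle as its first parameter, casts it back to the C++ object, and forwards the call without adding any logic.

```cpp
// Hypothetical C++ class standing in for the API being wrapped.
class Region {
public:
    void SetRectangle(int left, int top, int width, int height) {
        left_ = left;
        top_ = top;
        width_ = width;
        height_ = height;
    }
    int Area() const { return width_ * height_; }

private:
    int left_ = 0;
    int top_ = 0;
    int width_ = 0;
    int height_ = 0;
};

extern "C" {

// Opaque handle: C callers only ever hold a pointer to this incomplete type.
typedef struct RegionHandle RegionHandle;

// Constructor and destructor become plain create/delete functions.
RegionHandle* Region_Create() {
    return reinterpret_cast<RegionHandle*>(new Region());
}

void Region_Delete(RegionHandle* handle) {
    delete reinterpret_cast<Region*>(handle);
}

// One C function per C++ method, with the object as the first parameter.
void Region_SetRectangle(RegionHandle* handle, int left, int top,
                         int width, int height) {
    reinterpret_cast<Region*>(handle)->SetRectangle(left, top, width, height);
}

int Region_Area(const RegionHandle* handle) {
    return reinterpret_cast<const Region*>(handle)->Area();
}

}  // extern "C"
```

A caller would use it as `RegionHandle* h = Region_Create(); Region_SetRectangle(h, 1, 2, 3, 4); Region_Delete(h);`. A real wrapper would also have to keep C++ exceptions from crossing the C boundary and export the functions from the DLL, both of which are left out here.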
Original comment by trop...@gmail.com
on 28 Oct 2010 at 6:37
I could do this in my spare time. I don't have much time, however, so I
cannot say by when.
Original comment by trop...@gmail.com
on 28 Oct 2010 at 6:47
Man I'm desperate for this... I need to know, was the 'wrapper' for version
2.04 able to let you set config settings as well? That could actually solve my
problem and possibly make my day/week/month. For example it would help a great
deal if I could tell stupid Tesseract that my font is mono-spaced.
Original comment by rfwo...@gmail.com
on 29 Oct 2010 at 5:24
Actually I don't want to rebuild version 2.04. The code for this old wrapper is
still here but it does not work anymore because the internals of the engine
have changed.
What I want is a new wrapper that replicates the C++ BaseAPI as closely as
possible. This is the only solution that is really future-proof.
In version 2.04 there is no elegant way to set config settings from the API,
but I think I can remember that the API used a special config file somewhere in
the TESSDATA folder (named api_config or similar). You could try to modify this.
Original comment by trop...@gmail.com
on 29 Oct 2010 at 7:01
I changed my mind and I won't do it (for the reasons, see the closed case #386).
So I'll stick with version 2.04.
Original comment by trop...@gmail.com
on 1 Nov 2010 at 9:31
I just had an idea. Maybe we could use SWIG. Not that creating this wrapper by
hand would be a huge amount of work but having it done automatically would be
nice if it does a good enough job. I've never used SWIG before but I've found a
separate branch that has the C wrapper feature, which was developed as a GSOC
project. It hasn't been merged back to trunk yet but it's not too ancient.
Could be worth a look.
http://swig.svn.sourceforge.net/viewvc/swig/branches/gsoc2008-maciekd
Original comment by JerseyChewi@gmail.com
on 1 Nov 2010 at 9:56
Troplin>
That's a shame, the community still needs this, and, I don't think you've
conclusively determined what you say in case 386 - the case was closed as
'invalid'. Anyways, I hope you change your mind.
As for me, I don't care which version I use, 2.04 or 3, as long as I can get
better results than I am currently getting with the 2.04 wrapper - such as the
ability to include settings. Perhaps you can help write a proper API wrapper
for 2.04?
Original comment by rfwo...@gmail.com
on 1 Nov 2010 at 1:40
> Comment 22 by rfwoolf, Today (2 hours ago)
> Troplin>
> That's a shame, the community still needs this, and, I don't think you've
> conclusively determined what you say in case 386 - the case was closed as
> 'invalid'. Anyways, I hope you change your mind.
I'll consider this a vote to reopen.
Original comment by joregan
on 1 Nov 2010 at 4:36
The actual writing of the wrapper is not the problem. That's done in no time.
But there is a lot more for me to figure out, as I don't really know how this
Google Code works, how I can contribute and so on.
But I'm still here, and since my other request is not conclusively closed, I'll
see what I can do.
Original comment by trop...@gmail.com
on 2 Nov 2010 at 4:17
@troplin:
I think it would be good to move the discussion about the wrapper to the Tesseract
Developers forum/mailing list: http://groups.google.com/group/tesseract-dev
Original comment by zde...@gmail.com
on 2 Nov 2010 at 7:23
So I had a go with SWIG just now. I ran it against baseapi.h, pageiterator.h
and resultiterator.h. It generated three files, none of which are very pretty,
but tesseract_proxy.h does look very promising so I've uploaded it for you all
to look at. I think this could be exactly what we need. I'm not sure when I'll
have a chance to actually try this stuff out but if anyone's interested, I'll
upload the other files.
It's worth mentioning that one warning was generated but at least it was only
one. :)
pageiterator.h:73: Warning(503): Can't wrap 'operator =' unless renamed to a
valid identifier.
Original comment by JerseyChewi@gmail.com
on 16 Nov 2010 at 11:36
Any news on this? I'm officially offering to pay for this now (and of course
the results must be shared with the community).
I'm offering USD100 -- that's all I can really afford, I'm a South African and
the South African Rand currency is rather weak.
If you're interested, contact me so we can arrange it.
What worries me however, is that after we finally have a wrapper, I still need
to write the 'Delphi' wrapper which calls the C wrapper - which I don't really
know how to do, so this would be like paying 100USD for only half the solution,
but I'm trapped here, I have no choice, my job and reputation are at stake ! :)
Original comment by rfwo...@gmail.com
on 15 Dec 2010 at 2:57
Here are the other files that SWIG produced. This may actually be sufficient to
use, I just haven't had a chance to try it. See if it works for you. I like the
fact that it's been automatically generated, so what you get is a consistent
mapping of the entire interface to C.
I produced these files from Jimmy's github repo a month ago. It hasn't changed
since but I don't know if there have been other changes elsewhere.
Original comment by JerseyChewi@gmail.com
on 15 Dec 2010 at 3:10
I can feel your desperation so let me expand on that. I don't know Delphi but I
would like to look into this further myself. While I appreciate the incentive,
I am already bogged down with work and my wife is due to give birth any day now!
Original comment by JerseyChewi@gmail.com
on 15 Dec 2010 at 3:18
Thanks for these files - I'm not even sure what to do with them...
Can you explain what these files are all about? For example, what does 'proxy'
mean in this case? And surely we only need 1 file for the wrapper?
Original comment by rfwo...@gmail.com
on 15 Dec 2010 at 3:21
SWIG is a program for automatically generating wrappers around C/C++ libraries
for other languages like Java, PHP, Python, whatever. As a GSoC project,
someone got it to also create C wrappers around C++ libraries. Just what we
need. Unfortunately it was never merged upstream and I don't know why. I think
I read that some corner cases weren't handled properly but I'm hoping it works
well enough for us.
The "wrap" files are the magic that makes SWIG work. The "proxy" files present
this magic in the form of an interface we can use. I think you build all these
files as part of a regular Tesseract build and then use the functions provided
by tesseract_proxy.h. This file is somewhat readable. I don't know how to use
the C++ API yet but the C equivalent is something like this.
TessBaseAPI *foo = new_TessBaseAPI();
TessBaseAPI_SetRectangle(foo, 1, 2, 3, 4);
Looking at tesseract_proxy.h more closely, I see some comments like "aaaaaa"
and "whoa" which are a little worrying! But on the other hand, only one warning
was emitted so maybe it's okay. If there is a problem, maybe I could contact
the GSoC student but only after I've tried it myself.
Original comment by JerseyChewi@gmail.com
on 15 Dec 2010 at 3:45
Chewi> Thank you for this. Can I be a further pain and just ask you to clarify
this for me...
"I think you build all these files as part of a regular Tesseract build and
then use the functions provided by tesseract_proxy.h"
You're saying, you take these files, and then re-build the DLL? Do you need
Visual Studio, or C++ or what? I am a Delphi Developer, so I'll need someone to
do this for me.
I guess this goes back to my offer.
If nobody can offer to help me soon, I'll have to go to elance or a freelancer
website and get somebody to do this.
Oh, and if anybody *DOES* decide to re-build the DLL, please do me a big favour
and make sure it doesn't require any dependencies, like the C++ runtime
distributable? The whole problem with version 2.04 is that my wrapper for the
DLL keeps on hard-crashing the application and on some PCs I get no
functionality at all - just a hard crash... very worrisome :p
Original comment by rfwo...@gmail.com
on 15 Dec 2010 at 3:54
Yes, you'll need to rebuild it with Visual C++. I'm not sure if you're more
afraid of C++ itself or just having to pay for Visual C++ but the Express
version is free. That's what I use.
Statically linking the C++ runtime is bad practice in my opinion but I do come
from a Linux background, which is a bit different.
Original comment by JerseyChewi@gmail.com
on 15 Dec 2010 at 4:05
Chewi> as you have generated the wrapper files, could you send us your interface
file as well, something like tesseract.i?
Original comment by FreeT...@gmail.com
on 1 Jan 2011 at 12:48
Chewi> Many thanks. So your tesseract.i is for Tesseract 2, not 3. Right?
Original comment by FreeT...@gmail.com
on 4 Jan 2011 at 7:07
No, all this is for v3.
Original comment by JerseyChewi@gmail.com
on 4 Jan 2011 at 7:26
[deleted comment]
Finally, got it working though having numerous edges awaiting for smoothening
>7z x swig.7z
>cd swig
>python setup.py clean
>python setup.py build
>sudo python setup.py install --prefix=/usr
>python test.py
Result= The (quick) [brown] {fox} jumps!
Over the $43,456.78 <lazy> #90 dog
& duck/goose, as 12.5% of E-mail
from aspammer@website.com is spam.
Der ,,schnelle” braune Fuchs springt
uber den faulen Hund. Le renard brun
<<rapide» saute par-dessus le chien
paresseux. La volpe marrone rapida
salta sopra il cane pigro. El zorro
marrén répido salta sobre el perro
perezoso. A raposa marrom répida
salta sobre o cio preguieoso.
Have fun.
Original comment by FreeT...@gmail.com
on 12 Jan 2011 at 6:32
rfwoolf> see if comment 44 could be of any help to you?
>Any news on this? I'm officially offering to pay for this now (and of course
>the results must be shared with the community).
>I'm offering USD100 -- that's all I can really afford, I'm a South African and
>the South African Rand currency is rather weak.
>If you're interested, contact me so we can arrange it.
Original comment by FreeT...@gmail.com
on 12 Jan 2011 at 6:35
Thanks FreeToGo. I hope I'm not setting myself up for disappointment by getting
a little excited - I've been through a lot in this project waiting to be able
to use the new version of Tesseract in Delphi.
Firstly, "Finally, got it working though having numerous edges awaiting for
smoothening" -- what exactly do you mean by that?
Secondly...
I'm not at a proper PC at the moment so I can't see what's in your swig.7z
attachment. As I understand it, we need a DLL that can accept C function calls
and thus has a C wrapper. Then I can translate the C function calls ("C
Wrapper") in Delphi so that I can call them using Delphi.
I *think* I also need the DLL to be 'statically linked' so that the client PCs
that get this DLL (which are all running Windows) don't need anything EXTRA -
just deploy the DLL and my EXE, and off you go.
What exactly is it that you've done? I apologise for my ignorance.
Thanks :D
Original comment by rfwo...@gmail.com
on 12 Jan 2011 at 7:12
svn version (as of today) attached
Original comment by FreeT...@gmail.com
on 12 Jan 2011 at 9:39
It had occurred to me that we could just go straight from C++ to Java or
whatever with SWIG but that would probably still hit the MSVC/MinGW
compatibility issue. I'm not sure if I care about that anymore though. Anyone
using MinGW should be capable of building their own copy of Tesseract. SWIG
doesn't support Delphi though.
Anyway, I'm glad you got this working. Did it manage to map the API cleanly to
Python? It seemed to have a couple of minor problems with C.
Original comment by JerseyChewi@gmail.com
on 12 Jan 2011 at 10:02
"SWIG doesn't support Delphi though." -- no of course not, but as per the very
first comment of this thread: "Several applications can only use C function
calls in their interfacing with Tesseract". In order to use Delphi with
Tesseract, we use C function calls of the DLL. The whole problem has been that
the old 'C Wrapper' of version 2 has not been brought forward properly to
version 3. Thus there's no way I can use this tesseract in Delphi - same goes
for a couple of other people.
Original comment by rfwo...@gmail.com
on 12 Jan 2011 at 10:09
rfwoolf> which version of Delphi are you using? OS? Probably XP.
Original comment by FreeT...@gmail.com
on 12 Jan 2011 at 12:19
Original issue reported on code.google.com by
nguyen...@gmail.com
on 26 Sep 2010 at 4:20