charlesw / tesseract

A .Net wrapper for tesseract-ocr
Apache License 2.0

Can the tesseract OCR run in Windows Phone 8 platform? #53

Open rsteng2 opened 10 years ago

rsteng2 commented 10 years ago

Since I successfully installed the tesseract OCR from the NuGet package manager in my Windows Phone 8 app project, will it be able to run like normal? I ask because I am still at the beginning stage of using the library, and I am still trying my best to solve problems like "Error 1 The type or namespace name 'tesseract' could not be found (are you missing a using directive or an assembly reference?)". Many thanks in advance.

charlesw commented 10 years ago

Hi Rsteng2, currently Windows Phone 8 (or below) is not supported, only the full versions of .NET 2.0 and above. While I could add a build configuration for Windows 8, I do not have tesseract DLLs built for the ARM architecture, which would be required.

rsteng2 commented 10 years ago

Hi Charlesw. Thank you for replying so quickly. I have looked at a lot of OCR engines, but most won't work on Windows Phone 8, which is actually part of my final year project. Only Leadtools and http://www.devscope.net/products/DevScopeOCR (tesseract based) will work, but they require funding. I thought you had already done this for Windows Phone 8, as it has a shared Windows core with Windows 8: http://lazure2.wordpress.com/2012/06/22/windows-phone-8-software-architecture-vs-that-of-windows-phone-7-7-5-and-the-upcoming-7-8/ I also wonder whether a Portable Class Library could do the job; I will try as much as I can. Thanks again~

charlesw commented 10 years ago

The trouble here isn't so much the wrapper (what I've written) but the native tesseract-ocr library, which I don't believe would currently execute since it is built targeting the x86 architecture. As I mentioned previously, I can easily create a new target to generate a version of the wrapper that will execute on Windows Phone 8, assuming there aren't any compatibility issues.

I think all Devscope has done, though I don't know for sure, is recompile the native tesseract library targeting the Windows Phone system, or something similar. Unfortunately I don't have a Windows Phone 8 device, so I can't really do this myself; however, I would be happy to offer some assistance if you choose to take this on. The first place to look might be a guide to porting an existing C++ Windows app to Windows Phone, and then doing the same to the tesseract sources. I'd also create a new GitHub project to make it easier for others to give you a hand if you choose to do this.

rsteng2 commented 10 years ago

I don't have a Windows Phone at the moment either; however, I can test on the emulator, although the emulator has quite high system requirements. I am determined to learn how to port the library, thank you very much. I am still an amateur in programming, so it will take some time.

Nabeelhassan commented 10 years ago

Please tell me what to do :/

I am in the same trouble. I need to use this OCR on Windows Phone 8. I tested it in my desktop app and it worked fine.

charlesw commented 10 years ago

Just had a bit of a look into this and it seems that P/Invoke is NOT supported on Windows Phone 8 (Source).

According to this, the only way to work with a native library like tesseract is to build a Windows Runtime Component for that library using C++/CX. This is unfortunately a completely different approach from using P/Invoke and would be (almost) akin to starting again from scratch.

charlesw commented 10 years ago

Here's the bug report for the issue that is preventing us from supporting WP8. If you want WP8 support, please upvote it, as I do not have the time to implement a Windows Runtime Component for tesseract.

Url: https://connect.microsoft.com/VisualStudio/feedback/details/777333/add-dllimport-support-for-net-in-windows-phone-8

Nabeelhassan commented 10 years ago

Could you guide me on what to do about this, as I need tesseract in my FYP (final year project)?

charlesw commented 10 years ago

Based on what's written in https://connect.microsoft.com/VisualStudio/feedback/details/777333/add-dllimport-support-for-net-in-windows-phone-8, the quickest path would be to generate a WinPRT (Windows Phone Runtime) component that exposes the Tesseract C and Leptonica APIs, like SharpDX has done (see: https://github.com/sharpdx/SharpDX/tree/master/Source/SharpDX.WP8).

Unfortunately I can't really provide a step-by-step guide for you at this time, as I'm not too sure exactly what is going to be required. However, if you download the SharpDX source and spend some time tracing through what they've done, hopefully you'll be able to replicate the same techniques for tesseract.

If you have any specific questions feel free to ask.

Nabeelhassan commented 10 years ago

So I just have to write a wrapper for all the interop classes, i.e. baseapi, constants, leptonicaapi, marshalhelper and windowslibraryloader?

And the rest can be left as it is?

charlesw commented 10 years ago

If you use the same approach as SharpDX then yes, you'd need to write a wrapper in C++/CX to expose the function pointers that are used by the interop classes (baseapi etc.) and then update those interop classes to use these function pointers instead of P/Invoke. You will also probably need to make some changes to the tesseract project itself so that it compiles for the ARM architecture.

Nabeelhassan commented 10 years ago

will try my best :)

thank you for your co-operation :) ...

Nabeelhassan commented 10 years ago

Please correct me if I am wrong.

Do I need to use the baseapi.h and baseapi.cpp files to make the Windows Runtime component?

Gladiatorza commented 10 years ago

Hi Nabeelyaya

If you get this working, please post your project source here, it would be appreciated.

Thanks

charlesw commented 10 years ago

Yes you'll need to use the baseapi.h file, but not the cpp files directly. Basically I think you'll need to do something like this:

  1. Generate Tesseract and Leptonica as a static library targeting Win Phone 8.
  2. Add a new WinPRT (Windows Phone Runtime) component project to the Tesseract library that references the static libraries. This will need to be implemented using C++/CX (see links above and the SharpDX example).
  3. Update the Interop wrapper classes to use the WinPRT component, but only for the WP8 configuration (give me a shout when you get this far and I'll provide some info on how to create a new config).

If I were you I'd probably attempt to expose just the API that gets tesseract's version first, as a proof of concept, rather than trying to do the whole lot in one go.
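A minimal sketch of that proof of concept, written in plain C rather than C++/CX so it compiles anywhere. It only shows the shape of the idea: the native side fills a table of function pointers and the caller goes through the table instead of P/Invoke. The names `tess_api_table`, `tess_api_table_init`, `query_version` and `stub_version` are hypothetical, and the stub stands in for the real version call in the Tesseract C API so that nothing here needs to link against tesseract.

```c
#include <stddef.h>

/* Shape of a version query: no arguments, returns a C string. */
typedef const char *(*version_fn)(void);

/* Hypothetical table of entry points handed over by the native component. */
typedef struct {
    version_fn get_version;
} tess_api_table;

/* Stand-in for the real library call so the sketch builds without tesseract. */
static const char *stub_version(void) {
    return "0.0.0-stub";
}

/* Fills the table; a real component would assign the library's own
   version function here instead of the stub (assumption). */
static void tess_api_table_init(tess_api_table *table) {
    table->get_version = stub_version;
}

/* What an interop class would do: call through the pointer.
   Returns NULL if the table or the entry is missing. */
static const char *query_version(const tess_api_table *table) {
    if (table == NULL || table->get_version == NULL) {
        return NULL;
    }
    return table->get_version();
}
```

Once a single entry like this works end to end on the phone, the remaining functions can be added to the table one at a time.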

Nabeelhassan commented 10 years ago

I can't generate the static lib files :'( It gives a lot of errors...

charlesw commented 10 years ago

Sorry, I should have referred you to https://github.com/charlesw/tesseract-vs2012, which is a fork of https://github.com/jakesays/tesseract-vs2012. These are actually the projects used to build the current DLLs provided with the distribution. I just had a quick play and was able to build some lib files targeting x86 just by changing the project configuration. Hopefully it is as easy to add a new configuration targeting the ARM architecture, though I couldn't find a lot of documentation on how to do this.
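One quick way to confirm that a new configuration really targets ARM is to compile a small check into the library. The sketch below is only an illustration: `target_architecture` is a hypothetical helper, and it relies on the compiler's predefined macros (`_M_ARM`, `_M_X64`, `_M_IX86` on MSVC, with the GCC/Clang equivalents alongside).

```c
/* Reports the architecture this file was compiled for, based on
   predefined compiler macros. Returns "unknown" if none match. */
static const char *target_architecture(void) {
#if defined(_M_ARM) || defined(__arm__)
    return "arm";
#elif defined(_M_ARM64) || defined(__aarch64__)
    return "arm64";
#elif defined(_M_X64) || defined(__x86_64__)
    return "x64";
#elif defined(_M_IX86) || defined(__i386__)
    return "x86";
#else
    return "unknown";
#endif
}
```

If a lib built with the new configuration still reports "x86", the configuration change has not taken effect for that project.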

Nabeelhassan commented 10 years ago

Can you build the ARM architecture files for me, please?

I think under Project Properties -> Configuration -> Linker -> Advanced there is a Target Machine attribute. You can change it to the ARM machine type. Will that build the required files?

charlesw commented 10 years ago

OK I'll give it a shot, no promises though :)

charlesw commented 10 years ago

I've created a new branch (win8) on https://github.com/charlesw/tesseract-vs2012 that contains a first attempt at porting the native tesseract libs over to windows phone 8. Unfortunately there are quite a few api changes that mean these libraries do not build when targeting the Windows Runtime.

Most of the issues seem to originate in leptonica (I haven't gotten as far as tesseract itself), in that it has code to fire up other programs, normally for debugging purposes such as viewing the output if a debug flag is set, which is a no-no when it comes to WRT.

I think the next step is to selectively remove this code so that leptonica compiles, using the _WRT define I added to the project config, for instance much like this:

#ifndef _WRT
     // code that should NOT be in WRT build
#endif

Check out my last commit on the mentioned project to see some examples of how to do this. Unfortunately I don't have the time to take this any further at this time so I'll have to leave this up to you.
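To make the pattern concrete, here is a small self-contained sketch of a debug helper guarded this way. It is not taken from leptonica: `debug_show_image`, `build_viewer_command` and the `viewer` program are hypothetical names used only for illustration. The desktop build launches the external program, while a build with `_WRT` defined compiles that call out entirely.

```c
#include <stdio.h>
#include <stdlib.h>

/* Builds the shell command for a hypothetical external viewer.
   Returns 0 on success, -1 if the buffer is too small. */
static int build_viewer_command(char *out, size_t out_size, const char *image_path) {
    int written = snprintf(out, out_size, "viewer \"%s\"", image_path);
    return (written < 0 || (size_t)written >= out_size) ? -1 : 0;
}

/* Returns 1 if the viewer was launched, 0 if it was skipped or failed. */
static int debug_show_image(const char *image_path, int debug_enabled) {
    if (!debug_enabled) {
        return 0;
    }
#ifndef _WRT
    /* Desktop build: spawning another program is allowed. */
    char command[512];
    if (build_viewer_command(command, sizeof command, image_path) != 0) {
        return 0;
    }
    return system(command) == 0;
#else
    /* Windows Runtime build: the launch is compiled out. */
    (void)image_path;
    return 0;
#endif
}
```

Keeping the function itself and only stubbing its body means callers elsewhere in the library do not need to change.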

Good luck!!!

Nabeelhassan commented 10 years ago

[Screenshot: untitled] This is the error that shows up when trying to open the .sln file.

What should I do here?

charlesw commented 10 years ago

What version of VS are you using? If you're using VS Express you'll need VS Express 2013 for Windows, not for Windows Desktop. This is required to build apps for the Windows Store and thus Windows Phone 8. I'm not sure if VS 2012 or earlier can build this project type.

charlesw commented 10 years ago

This could also be another reason: http://stackoverflow.com/questions/19413073/is-it-possible-to-create-a-windows-8-store-app-from-visual-studio-2013

Nabeelhassan commented 10 years ago

I have VS 2012. Could you please give me the lib files? I have made a WinRT component and it accessed the version, but I think it failed because the static lib files I had were for x86. Could I get a little help on this issue?

charlesw commented 10 years ago

Sorry I don't have any libs I can provide you with that will work for Windows phone. As mentioned above you'll need to also make some source changes to get them compiling when targeting Windows Runtime.

IllSc commented 10 years ago

I've found tutorials for BlackBerry 10 and iOS. However, I don't know how to apply them to Windows Phone. Maybe somebody can help: http://www.xitijpatel.com/2012/11/21/compiling-tesseract-for-blackberry-10 tinsuke.wordpress.com/2011/02/17/how-to-cross-compiling-libraries-for-ios-armv6armv7i386/

charlesw commented 10 years ago

Sorry, I'm not too sure myself.

AmitBhatnagar24 commented 9 years ago

Does anything change here with Windows 10 Universal? Can we use this within a Universal app?

yoisel commented 8 years ago

@AmitBhatnagar24 Nothing really changes with Windows 10 universal apps, except maybe that it comes with its own OCR API from Microsoft.

By the way, I was able to get Tesseract to build as a WinRT library by making a few minor modifications and by removing all its dependencies except leptonica. Compressed images are decompressed using WIC (Windows Imaging Component). You can take a look at it here if you are interested: https://github.com/yoisel/tesseract_winrt

At this point it's just a proof of concept, so only the Debug/x86 configuration builds, but I do intend to complete the task in the near future.

charlesw commented 8 years ago

Sweet :) (in reply by email to Yoisel Melis's comment of 23/10/2015, 8:08 am)