rmtheis / tess-two

Fork of Tesseract Tools for Android
Apache License 2.0
3.76k stars 1.38k forks

Difference between using this library and using Tesseract through NDK #203

gusavila92 closed 7 years ago

gusavila92 commented 7 years ago

Hi.

I'm building an Android OCR app, but I haven't decided yet whether to use this library or to use Tesseract through the Android NDK. I'm an Android developer but I haven't used the NDK, which is why I don't understand the difference. Can you tell me? A short description for beginners will be fine. Thanks.

JavadBadirkhanli commented 7 years ago

Hi! If you use the official Tesseract sources, you also need to build the leptonica, libjpeg, libpng and libz shared libraries yourself. Building all of these is not easy for beginners. As for tess-two, I have been using it for several months. It is the best OCR source for the Android platform.

HughJeffner commented 7 years ago

Would the resulting APK be any smaller if the library were built manually instead of pulled in as a dependency? It adds about 20MB to my app (not counting the training file). I am guessing ProGuard doesn't touch the NDK code, so it won't strip unused functions and the binaries will stay full size.

Robyer commented 7 years ago

@HughJeffner No, it won't be smaller when manually built. But if APK size is your concern, you can make it smaller by splitting your app's APK by CPU architecture - https://developer.android.com/studio/build/configure-apk-splits.html#configure-abi-split
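For readers who want to try the ABI split mentioned above, here is a minimal sketch of what it might look like in an app module's build.gradle.kts (Gradle Kotlin DSL). The list of ABIs is only an illustration and should match whatever architectures the app actually ships for.

```kotlin
// app/build.gradle.kts (sketch; the ABI list below is an assumption)
android {
    splits {
        abi {
            // Produce one APK per CPU architecture instead of one fat APK.
            isEnable = true
            // Clear the default ABI list, then name the ones to build.
            reset()
            include("armeabi-v7a", "arm64-v8a", "x86", "x86_64")
            // Skip the extra APK that bundles every ABI together.
            isUniversalApk = false
        }
    }
}
```

Each resulting APK then carries only the native libraries for its own architecture, which is where most of the size reduction comes from.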

rmtheis commented 7 years ago

@gusavila92 To clarify, this library does use the Android NDK. Most people use this library as a pre-built external dependency, though, and as a result they don't need to use the NDK to build it themselves.
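To illustrate what using the pre-built dependency looks like from app code, here is a rough Kotlin sketch that calls the library's Java wrapper and never touches the NDK directly. The function name `recognizeText` and the `dataPath` parameter are hypothetical; the sketch assumes `dataPath` points to a directory that contains a `tessdata` folder with `eng.traineddata` in it.

```kotlin
import android.graphics.Bitmap
import com.googlecode.tesseract.android.TessBaseAPI

// Hypothetical helper: runs OCR on a bitmap and returns the recognized text,
// or null if the engine could not be initialized.
fun recognizeText(bitmap: Bitmap, dataPath: String): String? {
    val api = TessBaseAPI()
    return try {
        // init() loads the "eng" training data found under dataPath/tessdata.
        if (api.init(dataPath, "eng")) {
            api.setImage(bitmap)
            api.getUTF8Text()
        } else {
            null
        }
    } finally {
        // Release the native resources held by the engine.
        api.end()
    }
}
```

The native code still runs underneath, but it arrives already compiled inside the dependency, so the app project itself needs no C++ build setup.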

alexcohn commented 5 years ago

@rmtheis I wonder, what advantage do you see in having separate shared libraries for this project? I believe that a monolithic libtesseract.so would be easier to work with, especially for people who use it as prebuilt.

rmtheis commented 5 years ago

@alexcohn The one advantage I can think of is that developers can remove shared library files they're not using in order to reduce overall app size.

Most users point their app module build.gradle file to use the AAR file hosted on Bintray. If I understand you correctly, when you're saying prebuilt you're referring to directly using the *.so files from your own app's C++ code.
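As a sketch of the setup described above, the dependency declaration in an app module's build.gradle.kts might look like the following. The version number is an assumption for illustration; check the project's README for the current one.

```kotlin
// app/build.gradle.kts (sketch; the version shown is an assumption)
dependencies {
    // Pulls in the pre-built AAR, including its bundled .so files.
    implementation("com.rmtheis:tess-two:9.1.0")
}
```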

alexcohn commented 5 years ago

Downloading the AAR from Bintray is definitely an example of prebuilt. People may use these prebuilt libraries from the com.googlecode.leptonica.android and com.googlecode.tesseract.android Java packages via JNI, or via C++ APIs, or even mix the two approaches.

At any rate, they won't use libjpgt.so and libpngt.so directly. Worse, if their app happens to depend on libjpeg, explicitly or transitively through some other AAR, there may be unhappy linker collisions.

Therefore, I have changed the png and jpeg libraries to STATIC. I have also built static libs for leptonica and tesseract, so now I can also build a static Android command-line executable, to resolve https://github.com/tesseract-ocr/tesseract/issues/1393.