DorisGM closed this issue 5 years ago
I don't have a good way to do it. As an interesting test, you could try running Firebase's language detection on the output of the English OCR and then run Arabic OCR if it isn't identified as English.
Note that msa is Malay and not Modern Standard Arabic.
Anyway, the slowness is a normal side effect and not really a bug in this project.
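For anyone who wants to try the "English first, Arabic only if needed" idea, here is a rough sketch of the control flow. It is not tested against tess-two or Firebase: `OcrFallback`, `Ocr`, and `recognizeWithFallback` are hypothetical names, the OCR engine and the language detector are passed in as plain functions, and the detector is assumed to return a language tag such as "en" for English.

```java
import java.util.function.Function;

/** Sketch of the "run English OCR, fall back to Arabic" strategy. */
final class OcrFallback {

    /** Hypothetical stand-in for an OCR engine: (language code, image) -> recognized text. */
    interface Ocr<I> {
        String recognize(String language, I image);
    }

    /**
     * Runs English OCR first. If the detector says the output is English, that text is
     * returned; otherwise the same image is recognized again with the Arabic data.
     * Assumption: detectLanguage returns "en" for English text.
     */
    static <I> String recognizeWithFallback(
            Ocr<I> ocr, Function<String, String> detectLanguage, I image) {
        String english = ocr.recognize("eng", image);
        String detected = detectLanguage.apply(english);
        if ("en".equals(detected)) {
            return english;
        }
        // Not identified as English: pay for a second pass with Arabic only.
        return ocr.recognize("ara", image);
    }
}
```

English images cost one pass and Arabic images cost two, so whether this beats a combined eng+ara init depends on the mix of images you actually see.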
Thanks for your reply. I now re-init with a different language depending on the language of the image being OCR'd. It works well.
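In case it helps others, switching the language per image can be sketched roughly like this. It is a minimal illustration, not code from tess-two: `LanguageSwitcher` and `Engine` are hypothetical names, and with tess-two the two `Engine` methods would presumably delegate to `TessBaseAPI.init(datapath, language)` and `TessBaseAPI.end()`. The point is to re-init only when the language actually changes, so consecutive images in the same language pay no extra cost.

```java
/** Sketch: re-initialize the OCR engine only when the requested language changes. */
final class LanguageSwitcher {

    /** Hypothetical minimal view of an OCR engine's lifecycle. */
    interface Engine {
        boolean init(String language); // true on success
        void end();                    // releases the loaded language data
    }

    private final Engine engine;
    private String current; // null until the first successful init

    LanguageSwitcher(Engine engine) {
        this.engine = engine;
    }

    /** Makes sure the engine is loaded with the given language; returns true if it re-initialized. */
    boolean use(String language) {
        if (language.equals(current)) {
            return false; // already loaded, nothing to do
        }
        if (current != null) {
            engine.end(); // drop the previous language before loading the next one
        }
        if (!engine.init(language)) {
            current = null;
            throw new IllegalStateException("init failed for language: " + language);
        }
        current = language;
        return true;
    }
}
```

Usage would be something like calling `switcher.use("ara")` before setting the image, then recognizing as usual.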
Summary: Decoding is slow when multiple languages are used. Can I switch languages dynamically when decoding images? I want to support multiple languages, but each image contains only one language: sometimes eng, sometimes ara. A single sentence never mixes languages.
Steps to reproduce the issue:
Expected result: When I init TessBaseAPI with eng+ara+msa, I want decoding to be as fast as when I init with eng only. Otherwise, maybe I need to switch the language dynamically myself when decoding images in different languages. If I re-init with a different language dynamically, will that affect decoding performance, and should I call TessBaseAPI.clear() before switching?
Actual result: Decoding is slow when multiple languages are used
Tess-two version: 9.0.0
Android version: 7.0.0
Phone/device model: Android TV Amlogic 905X
Phone/device architecture (armeabi, armeabi-v7a, x86, mips, arm64-v8a, x86_64, mips64): arm64-v8a
Link to training data used: https://github.com/tesseract-ocr/tessdata/tree/3.04.00
Link to image used as input: