rmtheis / tess-two

Fork of Tesseract Tools for Android
Apache License 2.0

Getting different results when using tesseract on mobile vs on PC using Python. #260

Closed: SoniKarsh closed this issue 5 years ago

SoniKarsh commented 5 years ago

Summary: I am developing an Android app that detects and recognizes text using OpenCV and Tesseract. After many attempts, I am still not getting the results I get when I run the same image through Python (pytesseract) or through the website newocr.com. Is there any other way to improve accuracy? pytesseract gives pretty good results without much effort or image preprocessing.

Steps to reproduce the issue:

  1. Before passing the raw image to Tesseract, I apply Otsu binarization.
  2. Image after binarization: https://i.stack.imgur.com/8UIBw.jpg
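
For readers unfamiliar with the preprocessing step above, here is a minimal, dependency-free sketch of Otsu binarization in Python. It is not the poster's code (the app uses OpenCV on Android); the function names `otsu_threshold` and `binarize` are hypothetical, and the input is assumed to be a flat list of 8-bit grayscale values. Which class the boundary value falls into is also an assumption here and may differ between libraries.

```python
def otsu_threshold(pixels):
    """Return the Otsu threshold for a list of 8-bit grayscale values (0-255)."""
    # Build the intensity histogram.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels at or below the candidate threshold
    sum_bg = 0  # sum of their intensities
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue          # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break             # no foreground pixels left
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor); Otsu maximizes it.
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, threshold):
    """Map values above the threshold to 255 and the rest to 0."""
    return [255 if p > threshold else 0 for p in pixels]


if __name__ == "__main__":
    sample = [10, 12, 11, 200, 205, 198]  # made-up values: dark text, light background
    t = otsu_threshold(sample)
    print(t, binarize(sample, t))
```

Because the threshold is derived from the histogram of whatever image is supplied, differences in resizing or grayscale conversion between the Android and Python pipelines can change the binarized result before Tesseract ever sees it.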

Expected result: Tillamook

Actual result: Tillfaoofk

Tess-two version: 9.0 (latest)

Android version: Any

Phone/device model: Any

Phone/device architecture (armeabi, armeabi-v7a, x86, mips, arm64-v8a, x86_64, mips64): armeabi-v7a, arm64-v8a

Link to training data used: https://github.com/tesseract-ocr/tessdata

Link to image used as input: https://i.stack.imgur.com/8UIBw.jpg

Stack Overflow question: https://stackoverflow.com/questions/55119596/not-getting-efficient-result-from-tesseract-ocr-as-newocr-producing

rmtheis commented 5 years ago

It's likely that your Python wrapper is using a different version. Check out this site for accuracy suggestions: https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality
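
One way to check the version suggestion on the PC side is to ask the Tesseract binary that the Python wrapper calls. The sketch below assumes the `tesseract` executable is on the PATH and relies on its `--version` flag; the helper names `parse_tesseract_version` and `installed_tesseract_version` are hypothetical. The thread does not say which versions are actually involved, and the engine bundled with tess-two has to be checked separately on the Android side.

```python
import re
import shutil
import subprocess


def parse_tesseract_version(output):
    """Extract a version such as '4.1.1' from `tesseract --version` output."""
    match = re.search(r"tesseract\s+v?(\d+(?:\.\d+)*)", output)
    return match.group(1) if match else None


def installed_tesseract_version(cmd="tesseract"):
    """Return the version of the local Tesseract binary, or None if not found."""
    if shutil.which(cmd) is None:
        return None
    # Merge stderr into stdout, since some releases print the version to stderr.
    result = subprocess.run(
        [cmd, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return parse_tesseract_version(result.stdout)


if __name__ == "__main__":
    print("Local Tesseract version:", installed_tesseract_version())
```

If the major version reported here differs from the engine version that the tess-two release is built on, the two setups may be running different recognition engines and trained data, which could explain different output for the same image.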

SoniKarsh commented 5 years ago

Thanks for your reply. I have already tried the image-improvement techniques from that page, with no luck so far, but I will apply them to my final image and update you soon.