rmtheis / tess-two

Fork of Tesseract Tools for Android
Apache License 2.0

OCR processing Accuracy issue #222

Closed kirantpatil closed 7 years ago

kirantpatil commented 7 years ago

Summary: I am trying to read the MICR line from the image attached below.

On Android, using tess-two, the result is not accurate.

The same image gives 100% accuracy when I process it on Ubuntu Linux.

What could be the reason?

Steps to reproduce the issue:

  1. Use the https://github.com/GautamGupta/Simple-Android-OCR project to capture the image and process it for MICR.
  2. Install Tesseract on Ubuntu and run the commands below to extract the MICR line from the same image:

     $ tesseract ocr.jpg dena_samsung_ocr_psm_default -l mcr
     $ cat dena_samsung_ocr_psm_default.txt
     @120 18 2@ 5600 1800 6[ 1 1 $000000000 5000$

Expected result: @120 18 2@ 5600 1800 6[ 1 1 $000000000 5000$

Actual result: @120 18 2@ 5600 1800 601 1 1 80000000000 5000$

Tess-two version: 8.0.0

Android version: 4.4.2

Phone/device model: GT-I9192

Phone/device architecture (armeabi, armeabi-v7a, x86, mips, arm64-v8a, x86_64, mips64):

Link to training data used: http://www.devscope.net/LinkClick.aspx?fileticket=IClT4oNw3Hg%3d&tabid=782&portalid=1&mid=1778

Link to image used as input: ocr
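To make the difference between the expected and actual results above easier to see, here is a minimal Python sketch that compares the two strings token by token. The helper name `token_mismatches` is hypothetical and only for illustration. It assumes both results split into the same number of whitespace-separated tokens, which holds for the two strings in this report.

```python
def token_mismatches(expected: str, actual: str):
    """Return (index, expected_token, actual_token) for each differing token."""
    exp_tokens = expected.split()
    act_tokens = actual.split()
    if len(exp_tokens) != len(act_tokens):
        # A positional comparison is only meaningful when the counts match.
        raise ValueError("token counts differ")
    return [
        (i, e, a)
        for i, (e, a) in enumerate(zip(exp_tokens, act_tokens))
        if e != a
    ]


# Strings copied from the report: Ubuntu output vs. Android (tess-two) output.
EXPECTED = "@120 18 2@ 5600 1800 6[ 1 1 $000000000 5000$"
ACTUAL = "@120 18 2@ 5600 1800 601 1 1 80000000000 5000$"

for index, exp, act in token_mismatches(EXPECTED, ACTUAL):
    print(f"token {index}: expected {exp!r}, got {act!r}")
```

Only two tokens differ. In both of them a MICR symbol character (`[` and the leading `$`) was read as digits on Android, while the plain digit groups match.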

kirantpatil commented 7 years ago

Ubuntu details:

$ tesseract --version
tesseract 3.05.01
 leptonica-1.74.4
  libjpeg 8d (libjpeg-turbo 1.3.0) : libpng 1.2.50 : libtiff 4.0.3 : zlib 1.2.8

rmtheis commented 7 years ago

Hmm, the Tesseract version you're using on Ubuntu is slightly ahead of the Tesseract version bundled with tess-two. Version 8.0.0 of this project was last updated in d33043af19c52f36e68ba9e79675f54f3e16fbbd, which corresponds to Tesseract 3.05.00. If you build from that commit on Ubuntu, you should get a matching result.

So this issue is due to a version mismatch and not a bug. Thanks for the report.
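As a rough illustration of the version mismatch described above, the following Python sketch parses the first line of `tesseract --version` output and compares it with the Tesseract version that tess-two 8.0.0 is said to bundle. The function name `parse_tesseract_version` and the constants are hypothetical. The version text is copied from the Ubuntu details in this thread rather than captured from a live command.

```python
import re


def parse_tesseract_version(version_output: str):
    """Extract (major, minor, patch) from `tesseract --version` text."""
    match = re.search(r"tesseract\s+(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("no Tesseract version found in output")
    return tuple(int(part) for part in match.groups())


# Text as reported for the Ubuntu machine in this thread.
UBUNTU_OUTPUT = "tesseract 3.05.01\n leptonica-1.74.4"
# Tesseract 3.05.00, the version tess-two 8.0.0 is based on per this thread.
TESS_TWO_BUNDLED = (3, 5, 0)

desktop = parse_tesseract_version(UBUNTU_OUTPUT)
if desktop != TESS_TWO_BUNDLED:
    print(f"version mismatch: desktop {desktop} vs tess-two {TESS_TWO_BUNDLED}")
```

Running a check like this before comparing OCR output across platforms can rule out a version difference as the cause.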