gali8 / Tesseract-OCR-iOS

Tesseract OCR iOS is a Framework for iOS7+, compiled also for armv7s and arm64.
http://www.nexor.it
MIT License

Tesseract doesn't seem to recognize text #153

Closed hhimanshu closed 9 years ago

hhimanshu commented 9 years ago

Hey @dlinsin @BamX and others

I am new to this library and pretty sure I must be doing something wrong, but I'm not sure what. I am trying to read the following shopping receipt (walmart_receipt).

My code for that looks like this:

- (void)viewDidLoad
{
    [super viewDidLoad];

    // Languages are used for recognition (e.g. eng, ita, etc.). Tesseract engine
    // will search for the .traineddata language file in the tessdata directory.
    // For example, specifying "eng+ita" will search for "eng.traineddata" and
    // "ita.traineddata". Cube engine will search for "eng.cube.*" files.
    // See https://code.google.com/p/tesseract-ocr/downloads/list.

    // Create your G8Tesseract object using the initWithLanguage method:
    G8Tesseract *tesseract = [[G8Tesseract alloc] initWithLanguage:@"eng"];

    // Optional: You could specify the engine to recognize with.
    // G8OCREngineModeTesseractOnly by default. It provides more features and is
    // faster than the Cube engine. See G8Constants.h for more information.
    //tesseract.engineMode = G8OCREngineModeTesseractOnly;

    // Set up the delegate to receive Tesseract's callbacks.
    // self should conform to G8TesseractDelegate and implement a
    // "- (BOOL)shouldCancelImageRecognitionForTesseract:(G8Tesseract *)tesseract"
    // method to receive a callback to decide whether or not to interrupt
    // Tesseract before it finishes a recognition.
    tesseract.delegate = self;

    // Optional: Limit the character set Tesseract should try to recognize from
    tesseract.charWhitelist = @"0123456789";

    // This is a wrapper for the common Tesseract variable kG8ParamTesseditCharWhitelist:
    // [tesseract setVariableValue:@"0123456789" forKey:kG8ParamTesseditCharWhitelist];
    // See G8TesseractParameters.h for a complete list of Tesseract variables

    // Optional: Limit the character set Tesseract should not try to recognize from
    //tesseract.charBlacklist = @"OoZzBbSs";

    // Specify the image Tesseract should recognize on
    tesseract.image = [[UIImage imageNamed:@"walmart_receipt.png"] g8_blackAndWhite];

    // Optional: Limit the area of the image Tesseract should recognize on to a rectangle
    tesseract.rect = CGRectMake(20, 20, 100, 100);

    // Optional: Limit recognition time with a few seconds
    tesseract.maximumRecognitionTime = 2.0;

    // Start the recognition
    [tesseract recognize];

    // Retrieve the recognized text
    NSLog(@"Text:%@", [tesseract recognizedText]);

    // You could retrieve more information about the recognized text with these methods:
    NSArray *characterBoxes = [tesseract recognizedBlocksByIteratorLevel:G8PageIteratorLevelSymbol];
    NSArray *paragraphs = [tesseract recognizedBlocksByIteratorLevel:G8PageIteratorLevelParagraph];
    NSArray *characterChoices = tesseract.characterChoices;
    UIImage *imageWithBlocks = [tesseract imageWithBlocks:characterBoxes drawText:YES thresholded:NO];
}

The output that I see is:

2015-03-01 10:10:00.442 testImage[45083:70b] Text: 13
53 142  11

I don't know what that means.

Can you please tell me how I can read the entire text from this image?

I have also uploaded the project if that would help you

hhimanshu commented 9 years ago

I tried dumping more data and got more confused

// You could retrieve more information about the recognized text with these methods:
    NSArray *characterBoxes = [tesseract recognizedBlocksByIteratorLevel:G8PageIteratorLevelSymbol];
    NSLog(@"characterBoxes:%@", characterBoxes);

    NSArray *paragraphs = [tesseract recognizedBlocksByIteratorLevel:G8PageIteratorLevelParagraph];
    NSLog(@"paragraphs:%@", paragraphs);

    NSArray *characterChoices = tesseract.characterChoices;
    NSLog(@"characterChoices:%@", characterChoices);

and the output is

2015-03-01 12:38:04.888 testImage[45600:70b] Text: 13
53 142  11

2015-03-01 12:38:04.889 testImage[45600:70b] characterBoxes:(
    "(2.56%) ' '",
    "(74.74%) '1'",
    "(69.03%) '3'",
    "(89.08%) '5'",
    "(72.80%) '3'",
    "(22.93%) ' '",
    "(78.33%) '1'",
    "(67.23%) '4'",
    "(70.94%) '2'",
    "(15.52%) ' '",
    "(80.01%) '1'",
    "(68.51%) '1'"
)
2015-03-01 12:38:04.890 testImage[45600:70b] paragraphs:(
    "(13.67%) ' 13\n53 142  11\n\n'"
)
2015-03-01 12:38:04.890 testImage[45600:70b] characterChoices:(
        (
        "(2.56%) ' '"
    ),
        (
        "(74.74%) '1'"
    ),
        (
        "(69.03%) '3'"
    ),
        (
        "(89.08%) '5'"
    ),
        (
        "(72.80%) '3'"
    ),
        (
        "(22.93%) ' '"
    ),
        (
        "(78.33%) '1'"
    ),
        (
        "(67.23%) '4'"
    ),
        (
        "(70.94%) '2'"
    ),
        (
        "(5.45%) ' '"
    ),
        (
        "(80.01%) '1'"
    ),
        (
        "(68.51%) '1'"
    )
)

But none of this gives me the data on the receipt I mentioned. Please help.

kevincon commented 9 years ago

I think you would have solved this problem yourself had you actually read the comments in the example code you copied. When in doubt, please please please read the comments! They're there for a reason!

Look at these two lines:

// Optional: Limit the character set Tesseract should try to recognize from
tesseract.charWhitelist = @"0123456789";
// Optional: Limit the area of the image Tesseract should recognize on to a rectangle
tesseract.rect = CGRectMake(20, 20, 100, 100);

The first line sets a whitelist, which restricts recognition to the digits 0-9. Any characters not in that whitelist will be ignored, which is why your output contains only numbers.

The second line restricts recognition to a small window of the Walmart receipt: the rectangle with origin (20, 20), a width of 100, and a height of 100.

If you comment out those two lines and re-run your app, you'll see the following printed out in the Xcode console:

2015-03-01 16:46:01.248 testImage[30569:2175050] Text:Walmart '
Save money. Live better. .
waImart
MANAGER DALE STEWEKT
(501) 328 - 9570
5T# 009% ovx 00001929 TE» 18 TR# 09137
MUNEHKIN CAR CLING SHADES 4.50 T
0 1928372837
INFANTINO INF S-IN-1 CARRIER 30.00 T
FIOgZEERlPRIZésgABY MIRROR 20 00 T
001928372562 '
GERBER CLOTH DIAPER lZ—PK 11.94 T
00992337253
GERBER gNESIES NEWBORN 0’3 7.94 T
GEOOKR7 827375 NEWBORN 0’3 7 94
 BURP CLDTHS 4*PK 9-94 T
05347§§2910 '
GERBER gNESIES NEWBORN 0’3 PINK 7_94 T
GEROERSBATENSET 4*PIECE B 24
05716281920, ' T
GERBER SLEEPN PLAY JUMPSUITS ZPK 9_§4

This is the best that Tesseract can do with your image unless you preprocess the image to make it easier to recognize and/or create a custom font/language file for Tesseract, trained on the font used in these Walmart receipts. Both of these tasks are outside the scope of this library, but you should be able to find tutorials online on training custom language files for Tesseract, and our Wiki has a section with ideas for preprocessing images: https://github.com/gali8/Tesseract-OCR-iOS/wiki/Tips-for-Improving-OCR-Results

kevincon commented 9 years ago

In fact, if you further comment out the following line:

// Optional: Limit recognition time with a few seconds
tesseract.maximumRecognitionTime = 2.0;

and re-run the app, you get this result which contains more of the text of the receipt:

2015-03-01 16:57:45.981 testImage[33045:2191003] Text:Walma rt '
Save money. Live better. .
waImart
MANAGER DALE STEWEKT
(501) 328 - 9570
5T# 009% ovx 00001929 TE» 18 TR# 09137
MUNEHKIN CAR CLING SHADES 4.50 T
0 1928372837
INFANTINO INF S-IN-1 CARRIER 30.00 T
FIOgZEERlPRIZésgABY MIRROR 20 00 T
001928372562 '
GERBER CLOTH DIAPER lZ—PK 11.94 T
00992337253
GERBER gNESIES NEWBORN 0’3 7.94 T
GEOOKR7 827375 NEWBORN 0’3 7 94
 BURP CLDTHS 4*PK 9.94 T
05347§§2910 '
GERBER gNESIES NEWBORN 0’3 PINK 7_94 T
GEROERSBATENSET 4*PIECE B 24
05716281920, ' T
GERBER SLEEPN PLAY JUMPSUITS ZPK 9_§4 T
Gzo‘ééfilsigé‘éfiww auwsum m 9 94 T
“5098788108..  .
55182518§74 '
GERBER INFANT GOWNS 2-PK 8.24 T
ag‘é‘éfifiséfiflifi <30sz m a 24
00983726362 ' T
FADED GLORY NEWBORN BODYSUIT 2.00 T
FAggDZ7E7gzlewBORN BODVSUIT z oo
FAOO71§923921EWBORN BODYSUIT 2'00 T
8593828§§3§ - T
FADED GLORY NEWBORN BODYSUIT 2.00 T
FAgggzgstRglaEb/BDRN PANTS 2 00 T
00710239392 '
FADED GLORY NEWBORN PANTS 2.00 T
FAggggégzngsaEWBORN PANTS 2 00 T
00774932929 '
FADED GLORY NEWBORN PANTS 2.00 T
00719283920
GARANIMALS TURTLE VIBRATE TOY 5_OO T
angfik‘ifliizgfifinn mm 2 00 T
00183923839 '
GARANIMALS CHIME ALONG 4.00 T
INOng'TIEEEEBST RATTLES 3 00 T
00182733938 '
GRACO DIGITAL MONITOR Z UNITS 60.00 T
0500232 339220815315 1 00 T
55.3% .55.; -
OELTA 1 *PK HANGERS 1.00 T
00928392398
DELTA 10*PK HANGERS 1.00 T
00900839283
DELTA 1 *PK HANGERS 1.00 T
00928932983
CHILD g MINE SLEEP&PLAY JUMPER 7.00 T
0054 232399
(HILD g MINE SLEEP&PLAY JUMPER 7.00 T
0059 398329
(HILD O MINE SLEEP&PLAY JUMPER 7.00 T
0058983R987
CHILD g MINE SLEEP&PLAY JUMPER 7.00 T
0059 379384
CHILD 0 MINE SLEEP&PLAY JUMPER 7.00 T
CH99D27$3EIZE SLEEP&PLAY JUMPER 7 ()0
00593374928 ' T
CHILD O MINE DRESS SET 5.75 T
04002159830343 DRESS SET 5 75 T
 DRESS SET 5-75
0029g734923 ' T
FADED GLORY NEWBORN DRESS SET 10.00 T
6083E29g82274P0KEY PUPPY 3 99 T
00289823738 '
GOLDEN BOOKS TURTLE SHELL 3.99 T
00919832409
SUBTOTAL 24 .47
TAX 1 8.371% 1 .84
TOTAL 25 . 1
DEBIT TEND 260.31
CHANDE DhE 0.00
EFT DEBIT PAY FROM er RY
AESOgNT : {8‘83
2 . 1 TOTAL PURCHASE

02/1252011’ 15:22:32
# ITEMS SOLD 41
TE! 2127 9170 9490 5255 INS
‘mm"mmmmmmmmmmmmmmmmmmmWWWHWmm
Iax Prup in stnv! It Jacksnn Hauitt
Ind 83 thick [lihiflfi at Ualnurt

This is because that line limits how long Tesseract can spend recognizing the image, so by commenting it out, you let Tesseract take as long as it needs.
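
Putting the three changes together, the setup could be sketched roughly as follows. This is only a minimal sketch: RecognizeFullReceipt is a hypothetical helper name (not part of the library), and it assumes the same walmart_receipt.png asset, the eng.traineddata file in tessdata, and that the framework's umbrella header is imported as shown.

```objc
#import <UIKit/UIKit.h>
#import <TesseractOCR/TesseractOCR.h>

// Hypothetical helper (not part of Tesseract-OCR-iOS): recognizes the whole
// receipt image without a whitelist, a rect, or a time limit.
static NSString *RecognizeFullReceipt(void)
{
    G8Tesseract *tesseract = [[G8Tesseract alloc] initWithLanguage:@"eng"];

    // No charWhitelist is set, so letters and punctuation are recognized too.
    // No rect is set, so the whole image is recognized, not a 100x100 window.
    // No maximumRecognitionTime is set, so recognition is not cut short.
    tesseract.image = [[UIImage imageNamed:@"walmart_receipt.png"] g8_blackAndWhite];

    // Run the recognition, then return the recognized text.
    [tesseract recognize];
    return [tesseract recognizedText];
}
```

Without a time limit, recognizing a long receipt can take a while, so it is probably better to call something like this off the main thread rather than directly from viewDidLoad.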

hhimanshu commented 9 years ago

Thanks a lot for your help, very much appreciated