dlareklami / tesseract-ocr

Automatically exported from code.google.com/p/tesseract-ocr

PageIterator::BoundingBoxInternal() sometimes gives wrong vertical coordinates #1144

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
PageIterator::BoundingBoxInternal() sometimes returns 0 for top and bottom at 
the RIL_WORD level, while left and right are always correct. Once the first 
word with zero values occurs, all following words on the page also have 
top == 0 and bottom == 0. Repeating recognition on the same page may give 
different results.

I am using the C API from Java to get bounding box information for recognized 
words. The error is not caused by the C-Java binding: I verified that the 
zeros also occur inside Tesseract.

I attached a patch file that solves this issue:

The wrong values are caused by an implicit conversion from inT16 to int in 
PageIterator::BoundingBoxInternal() when the coordinates are converted to 
top-down.

For some reason, the left and right coordinates are computed with 
static_cast<int>(box.left()) (likewise for right), but top and bottom are 
not. Adding the static_cast for top and bottom solves the issue.

Original issue reported on code.google.com by p.vorb...@gmail.com on 19 Apr 2014 at 11:21

GoogleCodeExporter commented 9 years ago
I just noticed that the errors I reported are not fixed by my patch. It seems 
that for some other reason the coordinates were correct during my tests after 
applying the patch.
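
That the cast made no difference is consistent with C++ integer promotion: a 16-bit operand is promoted to int before arithmetic, so the explicit static_cast cannot change the computed value. The following is a minimal sketch of that point only. FlipImplicit and FlipExplicit are hypothetical helpers written for illustration, and std::int16_t stands in for Tesseract's inT16; this is not the code of BoundingBoxInternal().

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical helper, not Tesseract code: converts a bottom-up y coordinate
// to top-down, relying on the implicit promotion of the 16-bit operand.
int FlipImplicit(int image_height, std::int16_t y) {
    return image_height - y;  // y is promoted to int before the subtraction
}

// The same conversion with the explicit cast that the patch proposed.
int FlipExplicit(int image_height, std::int16_t y) {
    return image_height - static_cast<int>(y);
}

// The implicit expression already has type int, so the cast adds nothing.
static_assert(std::is_same<decltype(0 - std::int16_t{0}), int>::value,
              "int - int16_t yields int");
```

Both helpers return the same result for every 16-bit input, which suggests the zeros had a different cause.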

Original comment by p.vorb...@gmail.com on 23 Apr 2014 at 7:20

GoogleCodeExporter commented 9 years ago
Can you provide a simple test case with an (input) image, so we can have a 
look at it?

Original comment by zde...@gmail.com on 24 Apr 2014 at 8:15

GoogleCodeExporter commented 9 years ago
Sorry, it seems I didn't receive an email notification.

I attached an image that (sometimes) causes the error.

I get the errors through my own custom Java wrapper, which isn't released yet. 
Today I tried to reproduce the error in C++ code, but I couldn't. Here's the 
code I used:

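    #include <cstdio>
    #include <leptonica/allheaders.h>
    #include <tesseract/baseapi.h>

    int main() {
        Pix *image = pixRead("fries_example.png");
        tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
        // Init() returns non-zero on failure, e.g. when the language data is missing.
        if (image == NULL || api->Init(NULL, "deu-frak") != 0) {
            fprintf(stderr, "Could not read the image or initialize Tesseract.\n");
            return 1;
        }
        api->SetImage(image);
        api->Recognize(0);
        tesseract::ResultIterator* ri = api->GetIterator();
        tesseract::PageIteratorLevel level = tesseract::RIL_WORD;

        if (ri != 0) {
            do {
                const char* word = ri->GetUTF8Text(level);
                float conf = ri->Confidence(level);
                int x1, y1, x2, y2;
                ri->BoundingBox(level, &x1, &y1, &x2, &y2);
                printf("word: '%s';  conf: %.2f;  BoundingBox: %d,%d,%d,%d;\n",
                    word, conf, x1, y1, x2, y2);
                delete[] word;  // GetUTF8Text() returns a buffer owned by the caller
            } while (ri->Next(level));
            delete ri;
        }
        // Release the engine and the image.
        api->End();
        delete api;
        pixDestroy(&image);
        return 0;
    }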

My Java code is basically doing the same using the C API.

Since I didn't get the error in my C++ program, it must have something to do 
with the Java wrapper I am using or the way I am initializing the recognition 
process. I'll look into the Java code again and let you know when I find the 
problem.

Original comment by p.vorb...@gmail.com on 2 May 2014 at 10:37

GoogleCodeExporter commented 9 years ago
The error was a concurrency issue in Java. I initialized the API in one thread 
and then iterated over the recognition results in another thread, which caused 
the problem. I still don't know exactly why only the y coordinates were 
affected, but using the same thread to initialize and iterate gives me 
correct results.
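
The fix described here, keeping initialization and iteration on the same thread, can be sketched as thread confinement: one task does all the engine work and returns only plain data to the caller. This is a rough illustration under stated assumptions, not the wrapper's actual code. FakeEngine, PageResult, RecognizeOnOneThread and RecognizeAsync are hypothetical names, and FakeEngine is a stand-in that does not call Tesseract.

```cpp
#include <future>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for an OCR engine; this is NOT the Tesseract API.
// It records the initializing thread so cross-thread use can be detected.
class FakeEngine {
public:
    void Init() { owner_ = std::this_thread::get_id(); }
    bool CalledFromOwner() const { return owner_ == std::this_thread::get_id(); }
    std::vector<std::string> Words() const { return {"ein", "Beispiel"}; }
private:
    std::thread::id owner_;
};

// Plain data handed back to the caller; it holds no reference to the engine.
struct PageResult {
    std::vector<std::string> words;
    bool same_thread;
};

// Initializes the engine and reads its results inside one function, so the
// engine object is never touched by a second thread.
PageResult RecognizeOnOneThread() {
    FakeEngine engine;
    engine.Init();
    PageResult result;
    result.same_thread = engine.CalledFromOwner();
    result.words = engine.Words();
    return result;
}

// Runs the whole job on a single worker thread and waits for the result.
PageResult RecognizeAsync() {
    std::future<PageResult> job =
        std::async(std::launch::async, RecognizeOnOneThread);
    return job.get();
}
```

The caller receives only copied strings, so nothing that belongs to the engine crosses a thread boundary.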

This issue may be closed.

Original comment by p.vorb...@gmail.com on 2 May 2014 at 10:52

GoogleCodeExporter commented 9 years ago

Original comment by zde...@gmail.com on 2 May 2014 at 3:26