akorentlab / tesseract-ocr

Automatically exported from code.google.com/p/tesseract-ocr

Wrong assumptions in make_single_row() #446

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Run Tesseract on an image of a single character in the PSM_SINGLE_CHAR 
segmentation mode. The character should contain holes, such as "8", "6", "a", 
etc.

What is the expected output? What do you see instead?
Expected: Correctly recognized "8" or "6" or "a"
Instead: Recognized ":" or "." or "."

Please use labels and text to provide additional information.
This issue appears when the PSM_SINGLE_CHAR segmentation mode is in effect. The 
culprit is the following block of code:
if (block->blobs.singleton()) {
    blob_it.move_to_first();
    float size = MakeRowFromSubBlobs(block, blob_it.data()->cblob(), &row_it);
    if (size > block->line_size)
        block->line_size = size;
}

When a block contains a single blob, this code extracts the blob's child 
outlines into a new blob and places the old blob and the new one in two 
separate new rows.

For characters like those above, the child outlines are the contours of the 
inner connected components (the holes). So the result of make_single_row() is 
one row whose blob consists of the external contour outlines, plus a second 
row holding the internal contour outlines. Later, CleanupSingleRowResult() 
discards the former row, for understandable reasons.
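
The split described above can be pictured with a small toy model. This is only 
a sketch: ToyOutline, ToyRow and SplitSingleBlobIntoRows are hypothetical names 
invented for illustration, not Tesseract types or functions. The real code path 
goes through MakeRowFromSubBlobs(), as quoted above.

```cpp
#include <string>
#include <vector>

// Toy stand-ins, NOT Tesseract's real types: an outline has a label and
// nested child outlines (the holes of a glyph such as "8").
struct ToyOutline {
  std::string label;
  std::vector<ToyOutline> children;
};

// A toy "row" is just the list of outlines placed on it.
using ToyRow = std::vector<ToyOutline>;

// Mimics the behaviour described in this report: a single blob yields one
// row with its outer outline and, if it has holes, a second row of holes.
std::vector<ToyRow> SplitSingleBlobIntoRows(const ToyOutline& blob) {
  std::vector<ToyRow> rows;

  ToyRow outer_row;
  outer_row.push_back(ToyOutline{blob.label, {}});  // external contour only
  rows.push_back(outer_row);

  if (!blob.children.empty()) {
    rows.push_back(blob.children);  // the holes, promoted to their own row
  }
  return rows;
}
```

In this model an "8" with two holes produces two rows (one outer outline, two 
hole outlines), while a "1" without holes produces a single row.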

I can imagine why the above code block exists, but the control path needs to 
be corrected for the PSM_SINGLE_CHAR case.

P.S.: I don't know if you care much, but I also found that the function 
comments for fill_buckets() and empty_buckets() are irrelevant (they appear to 
be cloned from elsewhere).

Original issue reported on code.google.com by daemons2...@gmail.com on 9 Feb 2011 at 8:00

GoogleCodeExporter commented 9 years ago
I was not able to reproduce it with '8' (attached). It reproduces fine with 'a'.

tesseract.exe ..\..\8.bmp out -psm 10

Original comment by max.mar...@gmail.com on 27 Feb 2011 at 4:33

Attachments:

GoogleCodeExporter commented 9 years ago
I don't know why you can't reproduce the issue with "8" (maybe you should crop 
the image down); it's an exciting and illustrative case. However, I hope my 
explanations shed enough light on the causes and on what can be done to fix 
the issue.

I am also attaching an image to try out.

Regards,
Dmitry

Original comment by daemons2...@gmail.com on 3 Mar 2011 at 5:28

Attachments:

GoogleCodeExporter commented 9 years ago
I was able to reproduce it with the attached image.

Original comment by zde...@gmail.com on 24 Feb 2012 at 9:06

Attachments:

GoogleCodeExporter commented 9 years ago
@Dmitry,
I tested tl_orig_13.tif in r-679 (see the attached file). Only "KOA" is 
displayed as "KIJAI "; the rest is OK.

Tested in 3.01 (result attached), only "KOA" is displayed as "Kuai"; the rest 
is OK.
With regards,
-sriranga(79yrs)

Original comment by withbles...@gmail.com on 25 Feb 2012 at 3:17

Attachments:

GoogleCodeExporter commented 9 years ago
Well yes and no.
Single row, single word and single char all go through the same code.
If there is only one connected component, it makes 2 rows:
1. the original
2. the holes of the original (as you observe)

This is just in case the input image is a box containing the required text.
It isn't easy to reliably detect this case just looking at the shape, so it 
uses the confidence from OCR and picks the output from the row with the best 
confidence.

This doesn't work too well where the input is damaged, as it can recognize the 
holes with much more confidence as "." or ":" etc.
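
As a rough illustration of that selection step, here is a minimal sketch. 
ToyRowResult and PickBestRow are hypothetical names, and the confidence values 
in the usage note are made up for the example; this is not the actual 
CleanupSingleRowResult() implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-row OCR result; the names are illustrative only.
struct ToyRowResult {
  std::string text;
  float confidence;  // higher means the classifier is more certain
};

// Returns the index of the row with the highest confidence
// (the first row wins on ties). Precondition: results is not empty.
std::size_t PickBestRow(const std::vector<ToyRowResult>& results) {
  assert(!results.empty());
  std::size_t best = 0;
  for (std::size_t i = 1; i < results.size(); ++i) {
    if (results[i].confidence > results[best].confidence) {
      best = i;
    }
  }
  return best;
}
```

With an invented outer-row result of {"8", 0.62f} and a holes-row result of 
{":", 0.91f}, PickBestRow selects the holes row, which is the failure mode 
reported in this issue.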

I will think about it some more, but as a user of this API, would it be useful 
to have *both* outputs, so you can pick which you like best?

Original comment by theraysm...@gmail.com on 20 Sep 2012 at 11:01

GoogleCodeExporter commented 9 years ago
If I get it right, the automatic inverted text detection should be an option. 
Although it definitely is a must for out-of-the-box users, in many cases it 
fails for API users. There should be either a fixed "black on white" mode (fg 
pixels are always black, bg pixels are always white), or separate "black on 
white" and "white on black" modes.
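
To make the proposal concrete, a sketch of how such a mode could gate which 
rows are considered. Everything here is an assumption for illustration: 
ToyPolarity, ToyRowKind and CandidateRows do not exist in Tesseract, and no 
such option is known to be part of its API.

```cpp
#include <vector>

// Hypothetical option, not an existing Tesseract setting.
enum class ToyPolarity { kAutoDetect, kBlackOnWhite, kWhiteOnBlack };

// Which of the two rows from the single-blob split a result came from.
enum class ToyRowKind { kOuter, kHoles };

// Returns the rows that would be sent to recognition under each mode.
std::vector<ToyRowKind> CandidateRows(ToyPolarity mode, bool blob_has_holes) {
  switch (mode) {
    case ToyPolarity::kBlackOnWhite:
      // Foreground is always black: never reinterpret holes as text.
      return {ToyRowKind::kOuter};
    case ToyPolarity::kWhiteOnBlack:
      // Text sits inside a dark box: only the holes can be text.
      if (blob_has_holes) return {ToyRowKind::kHoles};
      return {};
    case ToyPolarity::kAutoDetect:
      break;
  }
  // Automatic detection: try both rows and let confidence decide.
  if (blob_has_holes) return {ToyRowKind::kOuter, ToyRowKind::kHoles};
  return {ToyRowKind::kOuter};
}
```

Under this sketch, an API user who knows the input is black on white would get 
only the outer row, so the holes of "8" could never be returned as ":".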

Original comment by daemons2...@gmail.com on 22 Sep 2012 at 6:30