AnantLabs / tesseractdotnet

Automatically exported from code.google.com/p/tesseractdotnet

.net 3 Confidence is always 0 #19

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

I am using the Tesseract 3 .NET wrapper.
When I call Recognize, it OCRs the image fine and I can get the string.
I need the confidence level of each character, but it is always 0.
What am I doing wrong?

        Dim image As New Bitmap("C:\MyImage.tif")
        Dim ocr As New TesseractProcessor

        ocr.Init(Nothing, "eng", False)
        Console.WriteLine(ocr.Recognize(image))

        ocr.InitForAnalysePage()
        ocr.SetVariable("tessedit_thresholding_method", "1")
        ocr.SetVariable("save_best_choices", "T")

        Dim doc As DocumentLayout = ocr.AnalyseLayout(image)
        For Each blk As OCR.TesseractWrapper.Block In doc.Blocks
            Console.WriteLine("Block Confidence: " & blk.Confidence)

            For Each para As Paragraph In blk.Paragraphs
                Console.WriteLine("para Confidence: " & para.Confidence)

                For Each ln As TextLine In para.Lines
                    Console.WriteLine("ln Confidence: " & ln.Confidence)

                    For Each wrd As Word In ln.Words
                        Console.WriteLine("wrd Confidence: " & wrd.Confidence)
                        Console.WriteLine("wrd Text: " & wrd.Text)

                        For Each ch As Character In wrd.CharList
                            Console.WriteLine("V:" & ch.Value)
                            Console.WriteLine("C:" & ch.Confidence)
                        Next

                    Next

                Next
            Next
        Next

What is the expected output? What do you see instead?
The confidence is always zero.

What version of the product are you using? On what operating system?
tesseract engine 3.x .net wrapper v1.0 RC2

Please provide any additional information below.

Original issue reported on code.google.com by curtisjo...@gmail.com on 15 Mar 2012 at 2:19

GoogleCodeExporter commented 9 years ago
Dear curtisjohnston,

Sorry, I cannot post all of the modifications, but I hope the information below 
will help you.

The AnalyseLayout function only collects the bounding rectangle of each 
recognized layout item.
If you want the confidence of the recognized text, you can add and modify 
a few lines of code as follows:

- 1. After the recognize() step, get and store the result iterator pointer, as 
in the Recognize(...) method.
- 2. Add a CollectRecongnizedResults(ResultIteratorBase resultIterator) method 
to RecognitionItem.
- 3. Use the Tesseract API directly.

* In your case, ResultIteratorBase is ResultIterator (see baseapi.h for the 
GetIterator() method).
* The code in (1) and (2) is just a .NET wrapper that I made, so you will have 
to port it back to C++; I think that will not be a problem for you.

(1)
public virtual String Recognize(Image image, ref DocumentLayout doc)
        {
            _collectResultDetails = true;
            try
            {
                // clear document
                if (doc == null)
                    doc = new DocumentLayout();
                else
                    doc.Blocks.Clear();

                String txt = Recognize(image); // also stores _resultIterator while _collectResultDetails is set, see (3)

                // collect details here
                if (_collectResultDetails && _resultIterator != null)
                    doc.CollectRecongnizedResults(_resultIterator);

                return txt;
            }
            finally
            {
                _collectResultDetails = false;
                DisposeResultDetailCollector();
            }
        }

(2)
public virtual void CollectRecongnizedResults(ResultIteratorBase resultIterator)
        {
            ePageIteratorLevel curLevel = this.GetPageIteratorLevel();

            // recognized confidence: the iterator's value scaled by 0.01
            this.Confidence = 0.01 * resultIterator.GetConfidence(curLevel);

            // get specific features
            switch (_pageLevel)
            {
                case ePageIteratorLevel.RIL_SYMBOL:
                    String txt = resultIterator.GetUTF8Text(curLevel);
                    (this as Character).Value = (txt != null && txt.Length > 0 ? txt[0] : '$');
                    (this as Character).IsSuperscript = resultIterator.SymbolIsSuperscript();
                    (this as Character).IsSubscript = resultIterator.SymbolIsSubscript();
                    (this as Character).IsDropcap = resultIterator.SymbolIsDropcap();
                    break;
                case ePageIteratorLevel.RIL_WORD:
                    (this as Word).Text = resultIterator.GetUTF8Text(curLevel);
                    RecognitionFont recognizedFont = resultIterator.GetWordFontAttributes();
                    (this as Word).RecognizedFont = recognizedFont;
                    (this as Word).IsNumeric = resultIterator.WordIsNumeric();
                    (this as Word).Direction = resultIterator.WordDirection();
                    break;
                default:
                    break;
            }

            RecognitionItem child = this.CreateChild();
            if (child == null) // it is lowest level
            {
                resultIterator.GetBoundingBox(
                    this.GetPageIteratorLevel(),
                    ref Left, ref Top, ref Right, ref Bottom);
                return;
            }

            ePageIteratorLevel nextLevel = this.GetNextPageIteratorLevel();

            resultIterator.GetBoundingBox(
                curLevel, ref Left, ref Top, ref Right, ref Bottom);

            if (resultIterator.IsAtBeginningOf(nextLevel))
            {
                // get the first item
                child.CollectRecongnizedResults(resultIterator);
                this.AddItem(child);

                if (resultIterator.IsAtFinalElement(curLevel, nextLevel))
                    return;

                // get remaining items
                while (resultIterator.Next(nextLevel))
                {
                    child = this.CreateChild();
                    child.CollectRecongnizedResults(resultIterator);
                    this.AddItem(child);

                    if (resultIterator.IsAtFinalElement(curLevel, nextLevel))
                        break;
                }
            }
        }

(3)
String* OCRProcessor::Recognize(TessBaseAPI* api, Pix* pix)
{
    if (api == null || pix == null)
        return null;

    // dispose result collector if possible
    this->DisposeResultDetailCollector();

    api->SetImage(pix);

    bool succeed = api->Recognize(null) >= 0;

    // if succeed and do collect result details
    if (succeed && _collectResultDetails)
    {
        _resultIterator = 
            new ResultIteratorWrapper(api->GetIterator());
    }

    char* text = null;
    String* result = null;
    try
    {
        text = api->GetUTF8Text();
        result = Helper::ToUTF8String(text);
    }
    catch (System::Exception*)
    {
        // rethrow the original exception so its stack trace is preserved
        throw;
    }
    __finally
    {
        if (text != null)
        {
            delete[] text;
            text = null;
        }
    }

    return result;
}
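
The core of (2), grouping per-symbol results into words and scaling each raw confidence by 0.01, can be sketched without the wrapper. This is only an illustration: RawSymbol, CharacterResult, WordResult and CollectWords are hypothetical names, not part of tesseractdotnet or the Tesseract API. It assumes the raw confidence is a 0..100 percentage, and it computes the word confidence as the mean of its symbols, which is an assumption made for the sketch (the real iterator reports its own word-level value).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one symbol as an iterator might report it.
struct RawSymbol {
    char value;           // recognized character
    float rawConfidence;  // assumed to be a 0..100 percentage
    bool startsWord;      // true for the first symbol of a word
};

struct CharacterResult {
    char value;
    double confidence;    // scaled to 0..1
};

struct WordResult {
    std::string text;
    double confidence;    // mean of the symbol confidences (sketch assumption)
    std::vector<CharacterResult> chars;
};

// Walk a flat symbol stream, start a new word at each word boundary,
// and scale every raw confidence by 0.01 as in (2).
std::vector<WordResult> CollectWords(const std::vector<RawSymbol>& symbols)
{
    std::vector<WordResult> words;
    for (const RawSymbol& s : symbols) {
        assert(s.rawConfidence >= 0.0f && s.rawConfidence <= 100.0f);
        if (words.empty() || s.startsWord)
            words.push_back(WordResult{"", 0.0, {}});
        WordResult& w = words.back();
        w.text.push_back(s.value);
        w.chars.push_back(CharacterResult{s.value, 0.01 * s.rawConfidence});
    }
    // Every word holds at least one symbol, so the division is safe.
    for (WordResult& w : words) {
        double sum = 0.0;
        for (const CharacterResult& c : w.chars)
            sum += c.confidence;
        w.confidence = sum / w.chars.size();
    }
    return words;
}
```

Calling CollectWords on the symbols of one text line returns one WordResult per word, each carrying its characters and their scaled confidences, which is the shape the VB loop in the original question expects to print.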

Original comment by congnguy...@gmail.com on 2 Apr 2012 at 4:34

GoogleCodeExporter commented 9 years ago
I have a multipage TIFF file, and I want every page to be OCRed by 
tesseract.dll. I am using tesseract.dll 3.01 with C#. I have also set the 
following:
_ocrProcessor.SetVariable("tessedit_page_number", "-1");

but the DLL always OCRs only the first page. Please help.

Original comment by subhajit...@gmail.com on 5 Oct 2012 at 12:56