tasen / leptonica

Automatically exported from code.google.com/p/leptonica

recogSplitIntoCharacters all components removed #112

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Split an image into individual characters with recogSplitIntoCharacters().
2. The console prints "all components removed".

What is the expected output? What do you see instead?
The image should be split into individual characters, but no split occurs.

What version of the product are you using? On what operating system?
Windows XP, leptonica-1.72.

Please provide any additional information below.
    PIX       *pix;
    BOXA      *boxa;
    PIXA      *pixa;
    NUMA      *numa;
    L_RECOG   *recog;
    SARRAY    *sa;
    char      *fname;
    char      buf[250];
    l_int32   n, i, result;
    recog = recogCreate(100, 100, L_USE_ALL, 128, 1);
    sa = getSortedPathnamesInDirectory("C:/recog/GOSTdigits", "tif", 0, 0);
    n = sarrayGetCount(sa);
    for (i = 0; i < n; i++)
    {
        fname = sarrayGetString(sa, i, L_NOCOPY);
        pix = pixRead(fname);
        if (pix == NULL) continue;

        itoa(i, buf, 10);
        pixAddText(pix, buf);
        recogTrainLabelled(recog, pix, NULL, buf, 0, 1);
        pixDestroy(&pix);
    }
    recogTrainingFinished(recog, 0);
    sarrayDestroy(&sa);

    pix = pixRead("C:/recog/test3.tif");
    result = recogSplitIntoCharacters(recog, pix, 5, 5, &boxa, &pixa, &numa, 1);

    printf("result = %d\n", result);
    if (!boxa || !pixa || (result != 0))
        printf("Error in split\n");
    else
        printf("Split ok\n");

Original issue reported on code.google.com by V7Lu...@gmail.com on 22 Aug 2015 at 9:46

GoogleCodeExporter commented 8 years ago
Thank you for this bug report.  I have verified the problem and will get back 
to you within a week.

  -- Dan

Original comment by dan.bloo...@gmail.com on 22 Aug 2015 at 10:54

GoogleCodeExporter commented 8 years ago
The problem is due to the default value of MinFillFactor used in 
recogPreSplittingFilter().  This value is 0.25, which is much too large and 
causes all the thin characters here to be discarded!

A trivial fix is to reduce it, say, to 0.10.

A better solution is to make it settable, either in recogCreate() or in a new 
function such as recogSetSplitParams().

Original comment by dan.bloo...@gmail.com on 22 Aug 2015 at 11:19
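
For readers following the diagnosis above, here is a minimal sketch of why a minimum fill factor of 0.25 can reject thin characters. It assumes that "fill factor" means the fraction of foreground pixels inside a component's bounding box. The names fill_factor(), keep_component() and thin_stroke are hypothetical and written in plain C for illustration; they are not leptonica functions and do not reproduce the actual code in recogPreSplittingFilter().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sample: a thin diagonal stroke in a 5 x 5 bounding box,
 * row-major, one byte per pixel (nonzero = foreground).
 * 5 of the 25 pixels are foreground, so the fill factor is 0.20. */
const unsigned char thin_stroke[25] = {
    0, 0, 0, 0, 1,
    0, 0, 0, 1, 0,
    0, 0, 1, 0, 0,
    0, 1, 0, 0, 0,
    1, 0, 0, 0, 0
};

/* Fraction of foreground pixels inside a w x h bounding box.
 * Assumption: this is what "fill factor" means in the discussion above. */
double fill_factor(const unsigned char *pixels, int w, int h)
{
    int i, count = 0;

    assert(pixels != NULL && w > 0 && h > 0);
    for (i = 0; i < w * h; i++) {
        if (pixels[i])
            count++;
    }
    return (double)count / (double)(w * h);
}

/* Returns 1 if the component passes the filter, 0 if it is discarded. */
int keep_component(const unsigned char *pixels, int w, int h, double min_fill)
{
    return fill_factor(pixels, w, h) >= min_fill;
}
```

With this sample, keep_component(thin_stroke, 5, 5, 0.25) discards the stroke, while the same call with 0.10 keeps it. If every component in an image is this thin, a threshold of 0.25 leaves nothing to split, which is consistent with the "all components removed" message.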