shimat / opencvsharp

OpenCV wrapper for .NET
Apache License 2.0

OCR Tesseract: "Warning: Invalid resolution 0 dpi. Using 70 instead." #881

Closed: Blightbuster closed this issue 3 years ago

Blightbuster commented 4 years ago

Summary of your issue

When performing OCR on a captured screenshot, I get the warning "Warning: Invalid resolution 0 dpi. Using 70 instead." I capture the screenshot as a System.Drawing.Bitmap, which does contain the correct DPI. When I convert the bitmap to an OpenCvSharp.Mat and then run OCR on the Mat, the warning is printed to the console.

Environment

I'm using OpenCvSharp4 4.2.0.20200208 and OpenCvSharp4.runtime.win 4.2.0.20200208.

Example code:

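private Bitmap GetScreenshot(int x, int y)
{
    var size = new System.Drawing.Size(200, 50);
    var bmp = new Bitmap(size.Width, size.Height, PixelFormat.Format24bppRgb);

    // Dispose the Graphics object once the pixels have been copied.
    using (var gfx = Graphics.FromImage(bmp))
    {
        // Copy the screen region whose top-left corner is (x, y).
        gfx.CopyFromScreen(x, y, 0, 0, size);
    }

    return bmp;
}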

private string OCR(Bitmap bmp)
{
    var text = "";

    using (var mat = BitmapConverter.ToMat(bmp))
    {
        _ocrTesseract.Run(mat, out text, out _, out _, out _);
    }

    return text;
}

Output:

Warning: Invalid resolution 0 dpi. Using 70 instead.

What did you intend to be?

I'd like to either somehow tell Tesseract the DPI of the bitmap or mute the warning, since 70 dpi is close enough to the actual DPI.
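For the "mute the warning" half of this request, one possible approach is sketched below under stated assumptions. It assumes the warning is written by the native Tesseract library to the process's standard error stream, so it can be hidden by temporarily pointing file descriptor 2 at /dev/null around the OCR call. The sketch is C++ for a POSIX system, and `StderrSilencer` is a hypothetical helper name, not part of OpenCV, OpenCvSharp, or Tesseract. On Windows the analogous CRT calls would be `_dup` and `_dup2` with the `NUL` device, and calling them from C# would probably need P/Invoke.

```cpp
#include <cassert>
#include <cstdio>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper (not part of OpenCV or Tesseract): while an instance is
// alive, anything written to stderr (file descriptor 2) is discarded.
class StderrSilencer {
public:
    StderrSilencer() {
        std::fflush(stderr);                     // flush pending output first
        saved_fd_ = dup(STDERR_FILENO);          // remember the real stderr
        const int null_fd = open("/dev/null", O_WRONLY);
        if (saved_fd_ != -1 && null_fd != -1) {
            dup2(null_fd, STDERR_FILENO);        // fd 2 now points at /dev/null
        }
        if (null_fd != -1) {
            close(null_fd);                      // fd 2 keeps its own reference
        }
    }

    ~StderrSilencer() {
        std::fflush(stderr);
        if (saved_fd_ != -1) {
            const int rc = dup2(saved_fd_, STDERR_FILENO);  // restore stderr
            assert(rc != -1);                    // debug-build sanity check
            (void)rc;
            close(saved_fd_);
        }
    }

    // Copying would restore stderr twice, so the guard is non-copyable.
    StderrSilencer(const StderrSilencer&) = delete;
    StderrSilencer& operator=(const StderrSilencer&) = delete;

private:
    int saved_fd_ = -1;
};
```

To use it, wrap only the call that triggers the warning in a scope that holds a `StderrSilencer`, so stderr is restored as soon as the scope ends. The guard hides everything written to stderr while it is alive, including genuine errors, and it is not safe if other threads write to stderr at the same time.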

shimat commented 4 years ago

I couldn't find a way to do it either. 😢

It might be possible to do that by rewriting the OpenCV implementation. https://github.com/opencv/opencv_contrib/blob/master/modules/text/src/ocr_tesseract.cpp#L206

tess.SetImage((uchar*)image.data, image.size().width, image.size().height, image.channels(), image.step1());

// GetInputImage() returns a Pix*, which pixSetXRes/pixSetYRes take directly.
Pix* pix = tess.GetInputImage();
pixSetXRes(pix, 70);
pixSetYRes(pix, 70);

tess.Recognize(0);
...

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stone89son commented 3 years ago

This is a Tesseract problem, not an OpenCvSharp problem. Setting the DPI to 300 fixes it. Reference: https://www.google.com/search?q=tesseract+dpi+parameter&rlz=1C1FQRR_enJP969JP969&sxsrf=AOaemvK6LtKm-reUcSqkol3KNGPls4TxBA%3A1632322030603&ei=7kFLYf2hJLKC1e8Ph7eNkAM&oq=dpi+tesseract+&gs_lcp=Cgdnd3Mtd2l6EAMYAjIGCAAQFhAeMgYIABAWEB4yBggAEBYQHjIGCAAQFhAeOgcIABBHELADSgQIQRgAUNUIWNUIYJQkaABwA3gAgAGvA4gBrwOSAQM0LTGYAQCgAQHIAQjAAQE&sclient=gws-wiz
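The "set dpi to 300" suggestion does not say how to do it through the wrapper. A different but related technique is to resample the screenshot so its text has roughly the pixel size it would have in a 300 dpi scan; this changes the pixels rather than the DPI metadata, and it does not by itself silence the warning. The sketch below only shows the arithmetic. `Size` and `scaled_size_for_dpi` are hypothetical names, and the 96 dpi source value is an assumption (the usual Windows default at 100% display scaling), not something stated in this thread.

```cpp
#include <cassert>
#include <cmath>

// Plain pixel dimensions; a stand-in for whatever size type the caller uses.
struct Size {
    int width;
    int height;
};

// Returns the pixel size an image captured at source_dpi should be resampled
// to so that its content has the scale it would have at target_dpi.
Size scaled_size_for_dpi(Size original, double source_dpi, double target_dpi) {
    assert(source_dpi > 0.0 && target_dpi > 0.0);  // DPI values must be positive
    const double scale = target_dpi / source_dpi;
    return Size{
        static_cast<int>(std::lround(original.width * scale)),
        static_cast<int>(std::lround(original.height * scale))};
}
```

For the 200 x 50 screenshot in this issue, a 96 dpi source and a 300 dpi target give a scale factor of 3.125, so the target size is 625 x 156 pixels. The actual resampling could then be done with OpenCV's resize function before the image is passed to OCR.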

n0099 commented 1 year ago

After updating libtesseract-dev to 5.3.0, this warning is no longer shown. Perhaps Tesseract now tries to guess the DPI from the size of the characters in the image: https://github.com/tesseract-ocr/tesseract/issues/1702#issuecomment-991305751

The current Tesseract release 5.0.0 tries to guess the correct resolution if there is no explicit information from the image file.