nguyenq / tess4j

Java JNA wrapper for Tesseract OCR API
Apache License 2.0

Error calculating angle in ImageDeskew #230

Closed peterkronenberg closed 1 year ago

peterkronenberg commented 2 years ago

I'm seeing an error in the ImageDeskew routine. The sample code below reports a skew angle of -6.8 (the unredacted version reports -10) for the attached file, even though it should be 0. Any idea why it’s not calculating correctly?

It seems to happen on somewhat sparse images like this, which understandably makes it harder to determine the orientation. I'm wondering if anything can be done to make it more accurate. (Note: the sample file is a TIFF, which I couldn't attach directly, so I zipped it.)

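import java.awt.image.BufferedImage;
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.imageio.ImageIO;

import com.recognition.software.jdeskew.ImageDeskew;

public class GetAngle {

    // Returns the skew angle reported by ImageDeskew; values between -1 and 1 are treated as 0.
    private static double getAngle(Path sourceFile) throws IOException {
        BufferedImage bi = ImageIO.read(sourceFile.toFile());
        if (bi == null) {
            // ImageIO.read returns null when no registered reader can decode the file
            throw new IOException("Unsupported image format: " + sourceFile);
        }
        ImageDeskew id = new ImageDeskew(bi);
        double angle = id.getSkewAngle();
        if (angle < 1.0D && angle > -1.0D) {
            angle = 0.0D;
        } else {
            System.out.println("*** angle: " + angle);
        }

        return angle;
    }

    public static void main(String[] args) throws IOException {
        Path path = Paths.get("/testFiles", "sample rotated image - Redacted.tif");
        System.out.println("*** path: " + path);
        System.out.println("*** getAngle: " + getAngle(path));
    }
}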

sample rotated image - Redacted.zip

nguyenq commented 2 years ago

While we're investigating this issue, can you try the Leptonica methods that determine the skew angle? If they yield more consistent and accurate results, you may want to go that route; however, the image format conversion, from Java BufferedImage to Leptonica Pix and back, will incur some overhead. Please do some analysis, and submit a PR if needed. Thanks.

http://tess4j.sourceforge.net/docs/index.html

nguyenq commented 2 years ago

Could it be that the image has some invisible artifacts (lines) that skewed the results?

The existing Java-native method has been in use since the library's inception, and no one has complained about it.

peterkronenberg commented 2 years ago

I attached the file. No lines that I could see. Again, the document is not skewed at all. Not sure if anyone has ever tried it on an image like that. Do you see the same results? I can try the Leptonica library. What exactly is that for? I see the findSkew() method. I assume that pangle is the skew angle it finds. But what is pconf?

nguyenq commented 2 years ago

The lines may be invisible to human eyes.

Leptonica is the image processing library that Tesseract directly depends on. You will need to consult its documentation for usage.

https://tpgit.github.io/Leptonica/skew_8c.html

peterkronenberg commented 2 years ago

I don't think that's likely. Are you able to try it and determine if this is a bug? I thought this was a place I could get support.

nguyenq commented 2 years ago

I tried your image in the VietOCR GUI. Deskewing the entire image did rotate it incorrectly. If I split it in half (top/bottom) and trim the empty space, it works correctly. The large empty space between the header and footer seems to have thrown it off.
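
For anyone who wants to try the split-and-trim workaround in code rather than in the GUI, here is a rough sketch using only the JDK. The class name `WhitespaceTrim`, its helper methods, and the `INK_THRESHOLD` value are all hypothetical and not part of Tess4J; the sketch also assumes an opaque image with dark text on a light background.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of Tess4J: splits a page into top and bottom
// halves and crops each half to the bounding box of its dark ("ink") pixels.
final class WhitespaceTrim {

    // Assumed cutoff: pixels with luminance below this value count as ink.
    static final int INK_THRESHOLD = 128;

    static boolean isInk(int rgb) {
        int r = (rgb >> 16) & 0xFF, g = (rgb >> 8) & 0xFF, b = rgb & 0xFF;
        return (r * 299 + g * 587 + b * 114) / 1000 < INK_THRESHOLD;
    }

    // Bounding box of ink pixels within rows [y0, y1), or null if those rows are blank.
    static Rectangle inkBounds(BufferedImage img, int y0, int y1) {
        int minX = Integer.MAX_VALUE, minY = Integer.MAX_VALUE, maxX = -1, maxY = -1;
        for (int y = y0; y < y1; y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                if (isInk(img.getRGB(x, y))) {
                    minX = Math.min(minX, x);
                    maxX = Math.max(maxX, x);
                    minY = Math.min(minY, y);
                    maxY = Math.max(maxY, y);
                }
            }
        }
        return maxX < 0 ? null : new Rectangle(minX, minY, maxX - minX + 1, maxY - minY + 1);
    }

    // Crops to the given rectangle; a null rectangle (blank region) yields null.
    static BufferedImage crop(BufferedImage img, Rectangle r) {
        return r == null ? null : img.getSubimage(r.x, r.y, r.width, r.height);
    }

    // Returns { trimmed top half, trimmed bottom half }; a blank half is null.
    static BufferedImage[] splitAndTrim(BufferedImage img) {
        int mid = img.getHeight() / 2;
        return new BufferedImage[] {
            crop(img, inkBounds(img, 0, mid)),
            crop(img, inkBounds(img, mid, img.getHeight()))
        };
    }
}
```

Each non-null part could then be passed to `new ImageDeskew(part).getSkewAngle()`, as in the sample code at the top of this issue. Whether this helps on other sparse pages is untested; it only mirrors the manual steps described above.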

peterkronenberg commented 2 years ago

Yes, that is exactly my point. Is there anything I can do to improve this? I have several images like this (pages from a PDF file) that have an address at the top or a few other lines of text, with a lot of white space. But the lines of text are clearly horizontal.

nguyenq commented 2 years ago

Try posting your question and images on Stack Overflow (SO). There are image processing experts there who could help.

Or you may want to try the Leptonica methods first; if need be, post on the Leptonica site for help.

nguyenq commented 2 years ago

@peterkronenberg Any luck (better results) with Leptonica methods?

peterkronenberg commented 2 years ago

Sorry, haven't had an opportunity to try it yet.

nguyenq commented 2 years ago

Ran a test case for Lept4J:

    /**
     * Test of pixFindSkew method, of class Leptonica1.
     */
    @Test
    public void testPixFindSkew() {
        System.out.println("pixFindSkew");
        File input = new File("C:\\Temp\\samplerotatedimage-Redacted.tif");
        Pix pixs = Leptonica1.pixRead(input.getPath());
        // pixFindSkew expects a 1 bpp image, so binarize with a threshold of 128 first
        Pix pix1pp = Leptonica1.pixConvertTo1(pixs, 128);
        // Single-element output buffers for the angle and the confidence
        FloatBuffer pangle = FloatBuffer.allocate(1);
        FloatBuffer pconf = FloatBuffer.allocate(1);
        int expResult = 0; // 0 indicates success
        int result = Leptonica1.pixFindSkew(pix1pp, pangle, pconf);
        float conf = pconf.get();
        float angle = pangle.get();
        System.out.println("Confidence: " + conf + " Angle: " + angle);
        assertEquals(expResult, result);
    }

Output:

Running net.sourceforge.lept4j.Leptonica1Test
pixFindSkew
Confidence: 2.8027375 Angle: 0.21875

Documentation: https://tpgit.github.io/Leptonica/skew_8c.html
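
On the earlier question about `pangle` and `pconf`: per the Leptonica documentation linked above, `pangle` receives the angle required to deskew the image, in degrees, and `pconf` receives a confidence value computed as the ratio of the maximum to minimum scores; a return value of 0 means the call succeeded. One possible way to act on those values is sketched below. The class `SkewDecision` and both thresholds are assumptions chosen for illustration, not values taken from Tess4J or Lept4J.

```java
// Hypothetical helper, not part of Tess4J or Lept4J: decides whether a skew
// estimate from pixFindSkew is trustworthy enough to act on.
final class SkewDecision {

    // Assumed thresholds for illustration only; tune them against real documents.
    static final float MIN_CONFIDENCE = 3.0f;
    static final float MIN_ANGLE_DEGREES = 0.5f;

    // Returns the rotation to apply in degrees, or 0 when the estimate should be ignored.
    static float angleToApply(int status, float angle, float conf) {
        if (status != 0) {
            return 0f; // the call reported an error or an invalid measurement
        }
        if (conf < MIN_CONFIDENCE) {
            return 0f; // estimate too uncertain to act on
        }
        if (Math.abs(angle) < MIN_ANGLE_DEGREES) {
            return 0f; // rotation too small to be worth applying
        }
        return angle;
    }
}
```

With the output above (confidence 2.8027375, angle 0.21875), `SkewDecision.angleToApply(0, 0.21875f, 2.8027375f)` returns 0 under these assumed thresholds, so the page would be left unrotated.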

peterkronenberg commented 2 years ago

Thanks for trying this out!