mittagessen / kraken

OCR engine for all the languages
http://kraken.re
Apache License 2.0

Augmentation may distort too much #482

Closed by notiho 1 year ago

notiho commented 1 year ago

I just added a few lines to also log images using tensorboard, and I noticed that with augmentation enabled, lines sometimes get rotated so much that most of the content is lost towards the ends. I assume this is too much distortion to be useful? I have attached an example. I suspect that this is caused by ElasticTransform, which also applies an affine transformation.

[image: 皇帝陞壇上香行禮初獻奏時豐之章職事官獻帛爵]

bencomp commented 1 year ago

Off-topic: I would like to see how you logged images to Tensorboard. Could you share the code for this?

More on-topic: this does indeed look like too much distortion. I would think that to train a model to recognise text, it should be able to 'see' the text.

notiho commented 1 year ago

Sure, I opened pull request #485 in case this is useful for other people.

mittagessen commented 1 year ago

I'm fairly sure we dialed in the values when we rewrote the augmentation a while ago, but as I just saw, the scale of the rotation also differs between the dataset types. What kind of training data are you using? With the line strip format the rotation can be quite high, resulting in data like yours, but otherwise it is fairly low, at most 3 degrees.

notiho commented 1 year ago

I was using arrow training data, so that should not have been the cause. I am also currently looking into this and will probably open a PR later.

mittagessen commented 1 year ago

OK, thanks. I've got a bunch on my plate so not having to track down bugs like this really helps.

notiho commented 1 year ago

I just made #489

mittagessen commented 1 year ago

Thanks, it's been merged.

colibrisson commented 1 year ago

@notiho I still have a few images that are too distorted:

[image]

notiho commented 1 year ago

You are right. It looks like for longer lines, a rotation of 3 degrees around the centre, which is currently the maximum possible, is already too much.
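A rough back-of-the-envelope sketch of why line length matters here: when a line is rotated about its centre, its end points shift vertically by about half the width times the sine of the angle. The function names and the example widths and heights below are illustrative assumptions, not kraken code or kraken defaults, and the calculation ignores cropping, padding and the other augmentation steps.

```python
import math


def end_displacement(line_width: float, angle_deg: float) -> float:
    """Vertical shift (px) of a line's end points when rotated about its centre."""
    return (line_width / 2.0) * math.sin(math.radians(angle_deg))


def max_safe_angle(line_width: float, line_height: float, fraction: float = 0.5) -> float:
    """Largest rotation (degrees) that keeps the end shift within
    `fraction` of the line height. `fraction` is a hypothetical tolerance."""
    # (W / 2) * sin(theta) <= fraction * H  =>  sin(theta) <= 2 * fraction * H / W
    ratio = min(1.0, 2.0 * fraction * line_height / line_width)
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    height = 120  # assumed line height in px, for illustration only
    for width in (400, 1000, 2000, 4000):  # assumed line widths in px
        shift = end_displacement(width, 3.0)
        limit = max_safe_angle(width, height)
        print(f"width={width:5d}px  end shift at 3 deg={shift:6.1f}px  "
              f"angle keeping ends within half the height={limit:5.2f} deg")
```

Under these assumptions the end shift grows linearly with the line width, so a fixed angle limit that is harmless for short lines can push the ends of long lines out of the frame. Scaling the maximum angle with the aspect ratio of the line would be one way to account for that.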