galfar / deskew

Deskew is a command line tool for deskewing scanned text documents. It uses Hough transform to detect "text lines" in the image. As an output, you get an image rotated so that the lines are horizontal.
http://galfar.vevb.net/deskew
Mozilla Public License 2.0
164 stars 26 forks source link

Have you considered using a shear instead of rotation? #57

Open gtoal opened 8 months ago

gtoal commented 8 months ago

For small enough degrees of rotation, re-aligning using a shear in X and Y may be better than a true rotation, as it will preserve a 1:1 pixel mapping rather than requiring resampling. The resulting image may exhibit a tiny change in scale but for many applications (especially OCR) that may not be relevant. Shearing should also be cheaper/faster than true rotation.

galfar commented 6 months ago

Actually, a few versions ago I switched from shears to true rotation since there were complains (https://github.com/galfar/deskew/issues/15) about results being too blurred (wasn't "nearest neighbor" shear though) etc.

If you want you can try with version v1.25 which uses the old shear-based rotations (still here in Releases). Or best attach some sample where the differences (shear with ImageMagick maybe?) are visible so I know what to look for.

gtoal commented 6 months ago

That surprises me. I would have expected the opposite of what people reported, i.e. that rotation might cause slight blurring but the shearing shouldn't, with any hard edges being preserved as no pixels were being changed in content, only in position. Anyway I didn't realise when I posted, that you had already implemented it once, so clearly you are already aware of any benefits or detriments. I will say though that I used to do my own deskewing using shearing (my corrections were always small angles) and I don't remember ever being aware of any blurring. Anyway, no need for followup. I just thought it might have been something you hadn't tried yet - I was wrong. Btw I don't know what 'nearest neighbor' shear means? A shear is a shear, I can't think of any variations. If you have a shear of <N> pixels across the width of a document, you use Bresenham's to calculate (but not actually draw) a virtual line from (left,0) to (right, N), and use the Y value at each point in the virtual line to shift that X-column vertically by that number of pixels. Similarly for the other axis if shearing in both dimensions. Were you doing something else (and if so maybe that accounts for the reported blurring?) Oh - I know! - did you shear and rescale by any chance? The rescaling - even if tiny - would account for blurring artifacts.

galfar commented 6 months ago

What I used before was Paeth's A Fast Algorithm for General Raster Rotation which uses 3 shears to achieve rotation. I implemented this for Imaging library for generic rotation of images and only later used it in Deskew. It does simple linear interpolation when moving the pixels but produces blurrier results than regular bilinear interpolation. Some more info: https://www.ocf.berkeley.edu/~fricke/projects/israel/paeth/rotation_by_shearing.html

With switching to true rotations I was able to bring in nicer bilinear, and also bicubic and Lanczos etc.

Maybe running Deskew with -q nearest might be worth a try? That won't do any resampling and should be closest to 1:1 mapping.

gtoal commented 6 months ago

Ah yes. I'd seen Paeth's result in the context of specifying ellipses in terms of shears instead of rotations but hadn't realised it was usable for arbitrary raster rotations. It does explain the rescaling/interpolation. When I was using only two shears to do small rotations I was explicitly accepting the fact that a small rescaling would have been necessary to match the dimensions exactly but was being skipped since the factor was less than .1% ... So the -q nearest just shears but doesn't rescale? That plus the user-specified angle parameter you mentioned in the other reply sound like together they will do exactly what I need! Thank you.

Graham