galfar / deskew

Deskew is a command line tool for deskewing scanned text documents. It uses Hough transform to detect "text lines" in the image. As an output, you get an image rotated so that the lines are horizontal.
http://galfar.vevb.net/deskew
Mozilla Public License 2.0
165 stars 26 forks

Detect skew angle only (no rotation done) #5

Closed galfar closed 5 years ago

galfar commented 7 years ago

Original report by Marek Mauder (Bitbucket: galfar, GitHub: galfar).


As requested on blog:

Can you explain how to simply find the angle but not rotate using this tool? Since I’m dealing with archival TIFFs I need to keep the DPI and embedded metadata in place, so I’m thinking I would use ImageMagick to rotate once I have the angle. Thanks.

Answer:

For now you could use the -l parameter (-l angle: skip the deskewing step if the skew angle is smaller) and give it a large threshold, so that rotation is always skipped.

$ deskew -l 80 Sken003.png
...
Preparing input image (Sken003.png) ...
Calculating skew angle...
Skew angle found: 0.23
Skipping deskewing step, skew angle lower than threshold of 80.00
Done!

For the next version I plan to change this: the angle will be optional, and if it is omitted, rotation will always be skipped.
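A minimal sketch, in Python, of how a script could pick the angle out of the console output above. It assumes the angle is always printed on a line of the form "Skew angle found: <number>", as in the sample run; the function name parse_skew_angle is made up for illustration and is not part of Deskew.

```python
import re

# Console output copied from the sample run shown in this comment.
SAMPLE_OUTPUT = """\
Preparing input image (Sken003.png) ...
Calculating skew angle...
Skew angle found: 0.23
Skipping deskewing step, skew angle lower than threshold of 80.00
Done!
"""

# Assumption: the angle is reported on its own line in exactly this form.
_ANGLE_RE = re.compile(
    r"^Skew angle found:\s*(-?\d+(?:\.\d+)?)\s*$", re.MULTILINE
)


def parse_skew_angle(output: str) -> float:
    """Return the skew angle (in degrees) reported in Deskew's console output."""
    match = _ANGLE_RE.search(output)
    if match is None:
        raise ValueError("no 'Skew angle found' line in the output")
    return float(match.group(1))


if __name__ == "__main__":
    print(parse_skew_angle(SAMPLE_OUTPUT))
```

The output text would come from running the deskew command with a large -l threshold and capturing its standard output.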

galfar commented 7 years ago

Original comment by Marek Mauder (Bitbucket: galfar, GitHub: galfar).


We must distinguish between these cases:

  1. Detection only: no output file is created.
  2. Small angle detected: no rotation is done, but the output file is still created in the requested format (see Issue #4).

galfar commented 6 years ago

Original comment by Miguel Medalha (Bitbucket: Medalha, GitHub: Medalha).


I discovered that images rotated by ImageMagick's "convert" or "mogrify", using the angle provided by Deskew, look a bit sharper than those rotated by Deskew itself, probably because the two imaging libraries resample pixels differently. This matters because any rotation that is not a multiple of 90 degrees softens the image to some degree. Obviously, less softening is preferable.

Deskew is especially good at detecting the skew angle of an image. It would be nice to be able to get the skew angle from Deskew and use it to rotate the image with any other program. To experiment with this, I came up with a "quick and dirty" solution for now. I went to the source file "MainUnit.pas" and made a few modifications so that the only output is the skew angle, already inverted (multiplied by -1) so that it gives the rotation needed rather than the skew angle itself. This value is then passed directly to "magick mogrify -rotate" as a parameter, through an environment variable. Mogrify rotates the image and rewrites it while respecting the embedded metadata.

In my view, Deskew would benefit greatly from being able to satisfy what I suppose are the two main cases: those who want to use all its functionality as is, producing the final images, and those who would like to integrate it in a broader workflow.

Regarding what I have just described (leaving aside for now other needs such as the type of compression, etc.), I propose the following new command line switches:

-g for "getting" as output only the needed rotation angle (the inverse of the skew angle).

-q for "quiet" (Deskew's output is a little bit "chatty" for integration into an automated workflow).

Of course these letters are only examples, albeit logical ones.

(In the case of quiet operation, a log option would also be desirable?).

Thank you again. Regards.

galfar commented 6 years ago

Original comment by Marek Mauder (Bitbucket: galfar, GitHub: galfar).


Good idea, something like this will be in!

galfar commented 5 years ago

Original comment by Miguel Medalha (Bitbucket: Medalha, GitHub: Medalha).


I appreciate your effort in introducing this feature. Nevertheless, as it stands it cannot be integrated into a wider workflow. What is needed, as I explained above, is a simple output of the rotation angle required to deskew the image, i.e. in MainUnit.pas remove all output except the following:

WriteLn(SkewAngle * -1:4:3);

This could be obtained by a new flag under the -g parameter.

In this way, the program’s output can be passed to any other program or file, giving your program increased versatility.
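For readers who do not use Pascal: the ":4:3" suffix in the WriteLn call above asks for a minimum field width of 4 and 3 decimal places. A rough Python equivalent, as a sketch only (the function name format_rotation is made up, and the special case for a zero angle is my own addition to avoid printing "-0.000"):

```python
def format_rotation(skew_angle: float) -> str:
    """Format the rotation that undoes a given skew angle, with 3 decimals."""
    # The rotation needed is the skew angle with its sign inverted.
    # Assumption: a zero skew should print as "0.000", not "-0.000".
    rotation = 0.0 if skew_angle == 0 else -skew_angle
    # Minimum width 4, fixed-point, 3 decimals (mirrors Pascal's :4:3).
    return f"{rotation:4.3f}"


if __name__ == "__main__":
    print(format_rotation(0.23))
```

The resulting string can be handed unchanged to another program's rotate option.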

Thank you again.

Miguel

galfar commented 5 years ago

Original comment by Miguel Medalha (Bitbucket: Medalha, GitHub: Medalha).


Here’s an example of a workflow I used with ImageMagick, after doing the aforementioned modification to MainUnit.pas:

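rem Capture the rotation angle printed by the modified Deskew build.
for /F %%r in ('deskew -g d "%ClientPath%%JobDir%%JobName%-%PageNum%.tif.tmp.tif"') do set Rotation=%%r
rem Rotate the file in place with ImageMagick, which respects the embedded metadata.
magick mogrify ^
-background OrangeRed ^
-rotate %Rotation% ^
-units PixelsPerInch ^
-resample 300x300 ^
+repage ^
"%ClientPath%%JobDir%%JobName%-%PageNum%.tif.tmp.tif"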
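The same workflow could be sketched in Python for use outside of a batch file. This is only an illustration under two assumptions: that "deskew -g d <file>" prints nothing but the rotation angle (true only for the modified build described in this thread, not for a stock Deskew), and that both deskew and magick are on the PATH. The function name and the injectable "run" parameter are made up for the example.

```python
import subprocess


def rotate_with_imagemagick(image_path, run=subprocess.run):
    """Ask Deskew for the rotation angle, then rotate the file with mogrify."""
    # Assumption: "-g d" is the switch of the modified build from this
    # thread, and its only output is the rotation angle.
    result = run(
        ["deskew", "-g", "d", image_path],
        capture_output=True, text=True, check=True,
    )
    rotation = float(result.stdout.strip())

    # Same ImageMagick options as in the batch example above.
    mogrify_cmd = [
        "magick", "mogrify",
        "-background", "OrangeRed",
        "-rotate", f"{rotation:.3f}",
        "-units", "PixelsPerInch",
        "-resample", "300x300",
        "+repage",
        image_path,
    ]
    run(mogrify_cmd, check=True)  # mogrify rewrites the file in place
    return mogrify_cmd
```

Passing "run" as a parameter makes it possible to try the function with a stand-in instead of the real programs, which is handy since mogrify overwrites its input file.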