vipul-sharma20 / document-scanner

An OpenCV based document scanner

Modify perspective rectification to match paper aspect ratio #2

Open erickim555 opened 8 years ago

erickim555 commented 8 years ago

Currently, when the code performs the perspective rectification/correction, it maps the original image to an [800 x 800] image (800 is hardcoded).

As a result, the resulting image looks "squashed" in the x direction.

To improve the output, one could map the image to an output size whose aspect ratio matches the paper's aspect ratio. For instance, let's assume that the form is on A4 paper: http://www.papersizes.org/a-paper-sizes.htm. The A4 height-to-width ratio is 11.7 / 8.3 ~= 1.41.

So, to maintain an A4 aspect ratio when doing the perspective correction, you can map the original image to an image of size [800 x 1128] (800 * 1.41 = 1128).

Since your test image is landscape, the dimensions should be:

Height: 800 pixels
Width: 1128 pixels

It'd be neat to also autodetect paper orientation, i.e. portrait vs. landscape.
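As a rough illustration of the sizing arithmetic above, here is a minimal sketch. The function name `target_size` and its parameters are made up for this example and are not part of the repository. The resulting `(width, height)` tuple is in the order that `cv2.warpPerspective` expects for its `dsize` argument.

```python
def target_size(short_side, long_to_short_ratio, landscape):
    """Return (width, height) in pixels for the rectified output image.

    Hypothetical helper: short_side is the pixel length of the paper's
    short edge, long_to_short_ratio is e.g. 11.7 / 8.3 for A4.
    """
    long_side = round(short_side * long_to_short_ratio)
    if landscape:
        return (long_side, short_side)
    return (short_side, long_side)


A4_RATIO = 11.7 / 8.3  # long side / short side of A4, about 1.41

print(target_size(800, A4_RATIO, landscape=True))   # (1128, 800)
print(target_size(800, A4_RATIO, landscape=False))  # (800, 1128)
```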

GrayHatter commented 8 years ago

Wouldn't it be better to guess the size/ratio?

Let the top left and right corners be A & B, and the bottom left and right corners be C & D (in xy coordinates, with the first top-left pixel being 0x0).

Then, if we assume a rectangle, the corrected width would be (AB + CD)/2 and the corrected height would be (AC + BD)/2, where AB etc. are the distances between those corners. With those proportions, you could then scale down a copy of the image, do the edge detection, scale up the resulting contours, and apply them to the original image. Scale + skew, then crop.
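The averaging idea can be sketched in a few lines. This assumes the height is meant to average the two vertical sides, AC and BD, and that corners are given as `(x, y)` pixel tuples. The name `estimate_size` is hypothetical, and `math.dist` needs Python 3.8 or newer.

```python
import math


def estimate_size(a, b, c, d):
    """Estimate (width, height) by averaging opposite sides.

    a = top-left, b = top-right, c = bottom-left, d = bottom-right.
    """
    width = (math.dist(a, b) + math.dist(c, d)) / 2
    height = (math.dist(a, c) + math.dist(b, d)) / 2
    return width, height


# An axis-aligned 4 x 3 rectangle is recovered exactly.
print(estimate_size((0, 0), (4, 0), (0, 3), (4, 3)))  # (4.0, 3.0)
```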

erickim555 commented 8 years ago

@GrayHatter That would work only if the camera taking the picture was completely "level" with the paper (i.e. only in-plane rotations, no out-of-plane rotations).

Because perspective effects modify lengths in a nonlinear way, doing what you suggest may result in distorted dimensions.

As an (extreme) example, consider these two photos of the same piece of paper. Perspective effects can result in your proposed approach producing incorrect aspect ratios: http://imgur.com/V7zaViC http://imgur.com/IrGV8Ck
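To put rough numbers on this, the sketch below projects the corners of a US Letter sheet through an idealized pinhole camera and applies the side-averaging estimate. Everything here is an assumption made for illustration: the camera model, the viewing distance of 20 inches, the focal length of 1, and the function names. At zero tilt the estimate matches the true ratio 11 / 8.5 (about 1.294); as the sheet tilts away from the camera the estimate drops, and at 60 degrees it falls below 1.

```python
import math


def project_letter_corners(tilt_deg, distance=20.0, focal=1.0):
    """Project the corners of an 8.5 x 11 sheet through an ideal pinhole camera.

    The sheet is centered on the optical axis and tilted about its
    horizontal axis by tilt_deg. Returns corners in the order
    top-left, top-right, bottom-left, bottom-right.
    """
    t = math.radians(tilt_deg)
    corners = []
    for x, y in [(-4.25, 5.5), (4.25, 5.5), (-4.25, -5.5), (4.25, -5.5)]:
        y_rot = y * math.cos(t)             # height after tilting
        depth = distance + y * math.sin(t)  # distance along the optical axis
        corners.append((focal * x / depth, focal * y_rot / depth))
    return corners


def averaged_ratio(a, b, c, d):
    """Height-to-width ratio from averaging opposite sides."""
    width = (math.dist(a, b) + math.dist(c, d)) / 2
    height = (math.dist(a, c) + math.dist(b, d)) / 2
    return height / width


for tilt in (0, 30, 60):
    ratio = averaged_ratio(*project_letter_corners(tilt))
    print(tilt, round(ratio, 3))
```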

However, for photos where the paper is captured at nearly-level, your approach is a good easy approximation for estimating the aspect ratio in a general way.

GrayHatter commented 8 years ago

Is that paper letter, or a4? If it's letter, I assume we do still go with my solution, if it's not I'll think a bit more.

But on the flip side, what would each look like if skewed to a static 800^2? Because I'm not suggesting that my solution is perfect. Just better, but that's just 5m of thought, so I could just as easily be wrong.

Then, that leaves the question: do we want to define the measurements as Letter/A4? If so, then we could still min/max the mean of the sides to determine if it's portrait/landscape, and if anyone does skew the photo that much, well, too bad for them, they should learn to take better pictures. (But I'd say that's a good solution for my suggestion as well.)

GrayHatter commented 8 years ago

@erickim555 also, all A-series paper sizes have an aspect ratio of sqrt(2).

erickim555 commented 8 years ago

@GrayHatter The paper that I used in those pictures is US Letter, i.e. 8.5 inches by 11 inches.

But yup, if the user captures the picture in a "sane" manner (ie nothing like the extreme example I showed), then your heuristic way of estimating aspect ratio is fine, and should handle most cases well.

To cover all bases, one could have the following system:

(1) Let the user specify the paper type in advance, i.e. A4 vs. US Letter vs. etc.
(2) If the user doesn't know (or is lazy), then they can choose "Auto detect", which will try its best to determine the paper dimensions using the averaging approach above.

(1) has the benefit of geometric correctness: the rectified image will be of the correct aspect ratio if the user knows the paper dimensions ahead of time. I suspect for >95% of use cases, the paper size will either be A4 or US Letter (they're both nearly the same anyways).
(2) has the benefit of convenience: for most typical (i.e. sane) inputs, the result will be good enough. Also, this will handle any aspect ratio automatically.
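A minimal sketch of how the two modes could fit together, assuming corners are `(x, y)` tuples ordered top-left, top-right, bottom-left, bottom-right. The names `PAPER_RATIOS` and `paper_geometry` and the `"auto"` keyword are hypothetical, not taken from the repository. In both modes orientation comes from comparing the averaged width and height, as suggested earlier in the thread.

```python
import math

# Long side / short side for the known paper types.
PAPER_RATIOS = {"a4": 297 / 210, "letter": 11 / 8.5}


def paper_geometry(paper, corners):
    """Return (long_to_short_ratio, is_landscape).

    paper is "a4", "letter", or "auto". With "auto" the ratio is
    estimated by averaging opposite sides of the detected quadrilateral.
    """
    a, b, c, d = corners
    width = (math.dist(a, b) + math.dist(c, d)) / 2
    height = (math.dist(a, c) + math.dist(b, d)) / 2
    is_landscape = width > height
    if paper == "auto":
        ratio = max(width, height) / min(width, height)
    else:
        ratio = PAPER_RATIOS[paper]
    return ratio, is_landscape


corners = [(0, 0), (400, 0), (0, 300), (400, 300)]
print(paper_geometry("letter", corners))
print(paper_geometry("auto", corners))
```

With a known paper type the ratio is exact regardless of how the photo was taken; only the orientation depends on the detected corners.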

GrayHatter commented 8 years ago

@erickim555 right, that ratio is spot on then. I agree completely, something like [--force-ratio=N.NNN] or [--letter[-landscape]] or [--a4[-landscape]] as an option.
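For illustration, the proposed options could be wired up with `argparse` roughly as follows. These flags are only the proposal from this thread (with the A4 flag spelled `--a4` for consistency); the project does not necessarily implement any of them, and `build_parser` is a made-up name.

```python
import argparse


def build_parser():
    """Build a parser for the hypothetical paper-size options."""
    parser = argparse.ArgumentParser(description="Document scanner (sketch)")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--force-ratio", type=float, metavar="N.NNN",
                       help="long-side / short-side ratio of the output")
    # Each paper flag stores its own name in args.paper.
    for name in ("letter", "letter-landscape", "a4", "a4-landscape"):
        group.add_argument("--" + name, dest="paper",
                           action="store_const", const=name)
    return parser


args = build_parser().parse_args(["--a4-landscape"])
print(args.paper, args.force_ratio)  # a4-landscape None
```

Putting all the options in one mutually exclusive group means a forced ratio and a named paper size cannot be combined by accident.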

GrayHatter commented 8 years ago

If you want to be really fancy, you could "detect" the ratio, knowing that A4 and Letter are super common, and if the ratio matches by a factor of X from the max(change_from_mean), then apply the A4 or Letter ratio instead. (Meaning take the larger of AB, CD and of AC, BD, subtract the mean of each, and see if the ratio matches Letter or A4.) Then you could just apply that transformation to the mean.

EDIT: err, I mean take the max of each, and check if (max length ratio > calculated mean ratio > A4/Letter ratio).
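One simplified reading of this idea is to snap the measured ratio to the nearest common paper ratio when it is close enough. The sketch below uses a plain relative tolerance rather than the exact max-versus-mean rule described above, and both the 5% threshold and the name `snap_ratio` are assumptions made for illustration.

```python
STANDARD_RATIOS = {"a4": 2 ** 0.5, "letter": 11 / 8.5}


def snap_ratio(measured, tolerance=0.05):
    """Snap a measured long/short ratio to A4 or Letter if close enough.

    Returns (name, ratio). name is None when no standard ratio lies
    within the relative tolerance, and the measured ratio is kept.
    """
    name, ratio = min(STANDARD_RATIOS.items(),
                      key=lambda item: abs(item[1] - measured))
    if abs(ratio - measured) <= tolerance * ratio:
        return name, ratio
    return None, measured


print(snap_ratio(1.40)[0])  # a4
print(snap_ratio(1.30)[0])  # letter
print(snap_ratio(1.00))     # (None, 1.0)
```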