davide710 / scanner

Your pocket scanner: effortless A4 document capture and transformation into a clean PDF

https://onlinescanner.pythonanywhere.com/

MIT License

7 stars, 6 forks

get_contours enhancement #1

Open davide710 opened 9 months ago

davide710 commented 9 months ago

get_contours is the core function: it detects the 4 corners of the document. Unfortunately, it doesn't work all the time. I think the preprocessing of the image before the call to cv2.findContours should be improved: as you can see in examples/ or by trying the code yourself, the document is often not detected at all, or not detected correctly. If you have any ideas, please write them in a comment or submit a pull request. Work only on scanner.py; the code is small, but if you can't understand some parts, don't hesitate to ask questions.

EDIT: In the latest commits I introduced the function white(), which improves performance drastically. However, I think there's still room for improvement.

RUTUPARNk commented 7 months ago

Instead of using a fixed threshold for edge detection (cv2.Canny), you can try adaptive thresholding (cv2.adaptiveThreshold). Adaptive thresholding calculates a different threshold for each region of the image, which might be more robust to uneven lighting:

def get_contours(img_gray):
    image = cv2.dilate(cv2.adaptiveThreshold(img_gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 2), None, iterations=1)
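To illustrate why a per-region threshold can cope with uneven lighting, here is a minimal pure-Python sketch of mean adaptive thresholding on a tiny synthetic image. It is not the code from scanner.py and not OpenCV's implementation (border handling in particular is simplified by clipping the window); the function names, the sample pixel values, and the block_size and c values are assumptions chosen for the example.

```python
def fixed_threshold(img, thresh):
    """Binarize with one global threshold: 255 where pixel > thresh, else 0."""
    return [[255 if p > thresh else 0 for p in row] for row in img]


def adaptive_mean_threshold(img, block_size=3, c=10):
    """Compare each pixel with the mean of its block_size x block_size
    neighbourhood minus c. The window is clipped at the image border."""
    h, w = len(img), len(img[0])
    r = block_size // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            local_thresh = sum(window) / len(window) - c
            row.append(255 if img[y][x] > local_thresh else 0)
        out.append(row)
    return out


# Synthetic page: the background brightens from left to right (100..170),
# with two dark "ink" pixels at columns 2 and 6.
row = [100, 110, 60, 130, 140, 150, 100, 170]
img = [row[:] for _ in range(3)]

# Desired result: only the two ink pixels become black.
expected = [[255, 255, 0, 255, 255, 255, 0, 255] for _ in range(3)]

adaptive = adaptive_mean_threshold(img, block_size=3, c=10)
```

In this example the ink pixel on the bright side (100) is as bright as the background on the dim side (100), so no single global threshold separates ink from paper, while the local comparison does.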

davide710 commented 7 months ago

Hello @RUTUPARNk, and thank you for your comment. I have just tried your suggestion, and it didn't work well on my test images. However, I suspected cv2.adaptiveThreshold could do a good job later in the script (i.e. line 123), and in fact I got pretty good results there, some better and some worse than the ones I had before (I have just committed them so you can see for yourself). If you are interested, you can also check issue #2, which is about the thresholding phase, so that the related discussion stays there.

Testspieler09 commented 4 months ago

Hi @davide710, I would like to work on this in a bit. I wondered whether it would be easier to remove the black edge on the sides of the PDF (if that is what you mean by "not detected right") by postprocessing it, instead of preprocessing the image.

What is the main aim of the script? Is it supposed to be fully automated, or could the user get some interactive prompts or options to customize its behavior?

Thanks in advance

davide710 commented 4 months ago

What I mean is that in some cases, when executing the script, cv2.findContours doesn't return the right contour because of bad lighting or shadows, and get_contours returns 'No document detected!', meaning there will be no PDF to postprocess. The function white() was a significant improvement because it helps highlight the edges of the document, making it easier to detect. The aim of the script is to automate as much as possible, leaving the most desperate cases to doc_scan_click.py, which lets the user select the 4 corners of the document.
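As a rough illustration of the failure mode described here, the sketch below shows one common way to choose a document outline among candidate polygons: keep the largest 4-vertex candidate, and report no detection if none is large enough. This is an assumption about the general approach, not the actual logic of get_contours; in a real script the candidates would presumably come from cv2.findContours followed by a polygon approximation such as cv2.approxPolyDP. The names pick_document and min_fraction are hypothetical.

```python
def polygon_area(points):
    """Absolute area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(points)
    for k in range(n):
        x1, y1 = points[k]
        x2, y2 = points[(k + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def pick_document(candidates, image_area, min_fraction=0.2):
    """Return the largest 4-vertex candidate covering more than
    min_fraction of the image, or None if there is no such candidate.
    min_fraction is a hypothetical tuning parameter."""
    best, best_area = None, min_fraction * image_area
    for poly in candidates:
        if len(poly) != 4:
            continue  # not a quadrilateral, so not a sheet of paper
        area = polygon_area(poly)
        if area > best_area:
            best, best_area = poly, area
    return best


# Candidates as they might look for a 100 x 100 pixel image.
triangle = [(0, 0), (99, 0), (0, 99)]
small_quad = [(0, 0), (10, 0), (10, 10), (0, 10)]
page = [(10, 10), (90, 12), (88, 95), (8, 92)]

found = pick_document([triangle, small_quad, page], image_area=100 * 100)
missing = pick_document([triangle, small_quad], image_area=100 * 100)
```

When shadows or reflections break the page outline, the page may never appear as a clean quadrilateral among the candidates, which corresponds to the None case here.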

Testspieler09 commented 4 months ago

So what I pointed out is another issue then, right? The PDF is supposed to be ink-saving, so black borders are "bad". [Will work on both issues]

PS: I meant postprocessing the image not the PDF

davide710 commented 4 months ago

Yes, it's another problem (less important in my opinion, but still a problem), and it would be great if you could make some progress on it. Please upload any additional test images you use, even if the script fails on them. Thanks

Testspieler09 commented 3 months ago

I am currently working on the mentioned issues (get_threshold and get_contours). I wanted to ask if you have any images that are not detected but should be. I had one, and it works after my changes, but I want to see whether that holds for all of them.

Maybe it would also be better practice to check which side of the rectangle is longer, so we know the orientation (perspective should not change the lengths that much, as that would make the text unreadable). What do you think? I mention this because one image had a different orientation and got stretched.
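A minimal sketch of that orientation check, assuming the four detected corners are already ordered top-left, top-right, bottom-right, bottom-left. The function names and the px_per_mm scale are hypothetical; only the A4 dimensions (210 x 297 mm) are fixed.

```python
import math

A4_SHORT_MM, A4_LONG_MM = 210, 297


def side_lengths(corners):
    """Average width and height of a quadrilateral whose corners are
    ordered top-left, top-right, bottom-right, bottom-left (assumed)."""
    tl, tr, br, bl = corners
    width = (math.dist(tl, tr) + math.dist(bl, br)) / 2
    height = (math.dist(tl, bl) + math.dist(tr, br)) / 2
    return width, height


def output_size(corners, px_per_mm=4):
    """Pick a landscape or portrait A4 target size (width, height) in
    pixels, depending on which pair of sides is longer."""
    width, height = side_lengths(corners)
    if width > height:
        return A4_LONG_MM * px_per_mm, A4_SHORT_MM * px_per_mm
    return A4_SHORT_MM * px_per_mm, A4_LONG_MM * px_per_mm


portrait = [(0, 0), (200, 0), (210, 300), (-5, 290)]
landscape = [(0, 0), (300, 5), (295, 210), (2, 200)]

portrait_size = output_size(portrait)
landscape_size = output_size(landscape)
```

The resulting size could then be used as the target of the perspective warp, so that a landscape page is no longer stretched into a portrait frame.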

davide710 commented 3 months ago

Hello, I just uploaded an image that is currently undetected (all the others I have now are). It is a bit tricky because the background partially reflects some light. Let me know what your results are (maybe I can help). I originally limited the code to vertical images to simplify and then never actually cared to change it. I agree it would be better as you say.

Testspieler09 commented 3 months ago

Just to give an update.

The program now scans all example images and the text is readable (using adaptive thresholding). I am now going to work on the orientation issue mentioned above and any other small problems I can find. Until then 👋

If you want to view the changes, here they are