Open RohitRastogi94 opened 4 years ago
@RohitRastogi94 If I understand correctly, the document contains rectangles. So the package is detecting the rectangle inside of the document, instead of the document itself?
Is this on iOS, Android, or both?
@humphreyja Yes, Document contains multiple rectangles. currently I am only trying in Android.
@humphreyja any update on this one?
@RohitRastogi94 Ok, I've run into this before too. I think the algorithm should be looking for the largest rectangle in the image, instead of the smallest.
You can try to remove this conditional and see if that helps. If so, I'll probably update the package with this change. https://github.com/HarvestProfit/react-native-rectangle-scanner/blob/master/android/src/main/java/com/rectanglescanner/helpers/ImageProcessor.java#L167
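To make the "largest instead of smallest" idea concrete, here is a minimal sketch in plain Java. It is not the code in ImageProcessor.java: the class and method names are hypothetical, and candidates are assumed to be already reduced to corner lists given as {x, y} pairs.

```java
import java.util.List;

// Hypothetical helper for illustration; not part of react-native-rectangle-scanner.
final class LargestQuad {
    /** Shoelace formula: area of a simple polygon whose vertices are listed in order. */
    static double area(double[][] pts) {
        double twice = 0.0;
        for (int i = 0; i < pts.length; i++) {
            double[] a = pts[i];
            double[] b = pts[(i + 1) % pts.length];
            twice += a[0] * b[1] - b[0] * a[1];
        }
        return Math.abs(twice) / 2.0;
    }

    /** Returns the four-cornered candidate with the largest area, or null if there is none. */
    static double[][] pickLargest(List<double[][]> candidates) {
        double[][] best = null;
        double bestArea = 0.0;
        for (double[][] quad : candidates) {
            if (quad.length != 4) {
                continue; // only quadrilaterals can be a document outline
            }
            double a = area(quad);
            if (a > bestArea) {
                best = quad;
                bestArea = a;
            }
        }
        return best;
    }
}
```

With a page outline and a smaller box printed inside it as candidates, pickLargest returns the page outline, whatever order the candidates arrive in.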
On iOS it works pretty well and recognizes rectangles fast; on Android, on the other hand, it barely recognizes the same items.
@glodzl I've been experimenting with this a little bit. The main difference between the two platforms is camera quality as well as software quality. On iOS, Apple has given us an API to use that finds rectangles in the image. On Android, we have to do all of that ourselves using OpenCV (this is why the package is so large). It involves applying filters to the image and finding all the borders in the image. It's really interesting stuff. But the outcome is a bit finicky sometimes. As this code is the same as all the other open source Android scanners out there, I'd love to see how I can improve it. This basically just comes down to 2 potential changes:
The above pull request looks at the last one. For the first one, I'm thinking it needs to determine a "brightness" level of the image and adjust the contrast to fit that brightness level before matching rectangles. Either that, or we generate a list of potential rectangles from a low light image and a high light image before selecting the best one (usually the largest rectangle).
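One possible reading of the brightness idea, sketched in plain Java: measure the mean gray level, then stretch the observed range to the full 0..255 range before detection. The class name and the choice of a simple min/max stretch are assumptions for illustration; the package may end up doing something different.

```java
// Hypothetical sketch; the input is a flat array of 8-bit gray values (0..255).
final class ContrastStretch {
    /** Mean gray level, usable as a rough "brightness" measure. */
    static double meanBrightness(int[] gray) {
        long sum = 0;
        for (int v : gray) {
            sum += v;
        }
        return (double) sum / gray.length;
    }

    /** Linearly maps the darkest pixel to 0 and the brightest pixel to 255. */
    static int[] stretch(int[] gray) {
        int lo = 255;
        int hi = 0;
        for (int v : gray) {
            lo = Math.min(lo, v);
            hi = Math.max(hi, v);
        }
        if (hi == lo) {
            return gray.clone(); // flat image: nothing to stretch
        }
        int[] out = new int[gray.length];
        for (int i = 0; i < gray.length; i++) {
            out[i] = Math.round((gray[i] - lo) * 255f / (hi - lo));
        }
        return out;
    }
}
```

A low-light frame whose values only span 40..193 comes out spanning 0..255, which gives the edge filters a stronger gradient to work with.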
Given our quarantine status here in the U.S., I'll probably be looking into this more over the next few days because there's nothing better to do 😂
Hi @humphreyja I'm trying to find a solution for this issue now. Have you found anything?
I gave Genius Scan a try and the edge detection works perfectly. Hopefully we'll get something like that.
@cagv24 I've been experimenting with PR #33. So far, I've added a better way to increase the contrast of the image before detection. But I've found that that doesn't work either. The problem is the findContours function. With the canny image filter, it needs a solid boundary around the entire rectangle in order for it to detect it.
I've been trying to figure out how to use the hough lines functionality with OpenCV as next steps. Hough lines will fill in the gaps and do a much more accurate detection, though I'm not sure how to select which lines map to the rectangle yet. I assume this is how The Grizzly Labs/Genius Scan is doing it.
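To illustrate why Hough lines tolerate gaps where findContours does not, here is a tiny plain-Java accumulator. It is a teaching sketch under assumed names, not the OpenCV implementation (OpenCV exposes the real thing as Imgproc.HoughLines and Imgproc.HoughLinesP). Every edge pixel votes for all (theta, rho) lines passing through it, so a broken edge still piles its votes onto one cell.

```java
// Hypothetical sketch of Hough voting on a boolean edge map indexed as edges[y][x].
final class HoughSketch {
    /** Returns {thetaDegrees, rho, votes} for the strongest line in the edge map. */
    static int[] strongestLine(boolean[][] edges) {
        int h = edges.length;
        int w = edges[0].length;
        int maxRho = (int) Math.ceil(Math.hypot(w, h));
        int[][] acc = new int[180][2 * maxRho + 1]; // rho can be negative, so offset by maxRho
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (!edges[y][x]) {
                    continue;
                }
                for (int t = 0; t < 180; t++) {
                    double rad = Math.toRadians(t);
                    int rho = (int) Math.round(x * Math.cos(rad) + y * Math.sin(rad));
                    acc[t][rho + maxRho]++;
                }
            }
        }
        int[] best = {0, 0, 0};
        for (int t = 0; t < 180; t++) {
            for (int r = 0; r < acc[t].length; r++) {
                if (acc[t][r] > best[2]) {
                    best = new int[] {t, r - maxRho, acc[t][r]};
                }
            }
        }
        return best;
    }
}
```

A horizontal edge at y = 10 with a ten-pixel hole in the middle still yields a single peak at theta = 90, rho = 10. Picking which four peaks form the document is the part this sketch does not address.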
@humphreyja I'm checking right now this question/answer on StackOverflow opencv-android-create-new-image-using-the-edges-of-the-largest-contour , that guy says he can find the biggest rectangle with that implementation (maybe we should take a look).
Notes:
He applies the canny filter before the gaussian blur.
And the RETR_TYPE is RETR_LIST instead of RETR_TREE.
I'm doing some tests with that approach and for now, it seems good.
@cagv24 Canny before the blur is actually not a bad idea haha. That could help fill in the missing holes. I didn't have much success with changing the RETR_TYPE flag. Tree will store the hierarchy of the contours (which maybe we don't need). Let me know how your tests go. I feel that swapping just the blur around may be a good incremental improvement to this without needing to implement any more complex algorithms.
After working with a friend (@Luifersuarez) we figured out this problem. The filters we apply now are the following:
Imgproc.cvtColor(resizedImage, grayImage, Imgproc.COLOR_RGBA2GRAY, 4);
Imgproc.GaussianBlur(grayImage, grayImage, new Size(5, 5), 0);
Imgproc.threshold(grayImage, grayImage, 127, 255, Imgproc.THRESH_BINARY | Imgproc.THRESH_OTSU);
The THRESH_OTSU is what did the trick.
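For anyone wondering what THRESH_OTSU adds: when that flag is set, OpenCV ignores the fixed threshold passed in (127 above) and picks the cut that best separates the histogram into two classes. Below is a plain-Java sketch of that selection, written from the textbook description of Otsu's method rather than copied from OpenCV; the class name is hypothetical.

```java
// Hypothetical sketch of Otsu's method on a flat array of 8-bit gray values.
final class OtsuSketch {
    /** Returns the threshold t that maximizes between-class variance; pixels <= t form one class. */
    static int threshold(int[] gray) {
        int[] hist = new int[256];
        for (int v : gray) {
            hist[v]++;
        }
        long total = gray.length;
        double sumAll = 0.0;
        for (int i = 0; i < 256; i++) {
            sumAll += (double) i * hist[i];
        }
        double sumLow = 0.0;
        long countLow = 0;
        double bestVariance = -1.0;
        int best = 0;
        for (int t = 0; t < 256; t++) {
            countLow += hist[t];
            if (countLow == 0) {
                continue; // no pixels in the low class yet
            }
            long countHigh = total - countLow;
            if (countHigh == 0) {
                break; // every pixel is already in the low class
            }
            sumLow += (double) t * hist[t];
            double meanLow = sumLow / countLow;
            double meanHigh = (sumAll - sumLow) / countHigh;
            double diff = meanLow - meanHigh;
            double variance = (double) countLow * countHigh * diff * diff;
            if (variance > bestVariance) {
                bestVariance = variance;
                best = t;
            }
        }
        return best;
    }
}
```

Because the cut follows the histogram, the same code adapts to a dim photo and a bright one, which a hard-coded 127 cannot do.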
Then in the findContours method we used the Retrieval Type RETR_EXTERNAL
And the last step is the algorithm to find the biggest area (it's an extra step that may not be necessary, because the filters do their job very well and usually leave only one contour). This is the code:
MatOfPoint biggestContour = null;
double maxArea = 0.0;
for (MatOfPoint contour : contours) {
    double area = Imgproc.contourArea(contour);
    if (area > size.area() / 5) { // This is useful to make the user take photos closer
        MatOfPoint2f contour2f = new MatOfPoint2f(contour.toArray());
        double peri = Imgproc.arcLength(contour2f, true);
        MatOfPoint2f approx = new MatOfPoint2f();
        Imgproc.approxPolyDP(contour2f, approx, 0.02 * peri, true);
        if (area > maxArea && approx.total() == 4) {
            biggestContour = contour;
            maxArea = area;
        }
    }
}
if (biggestContour != null) {
    Point[] points = biggestContour.toArray();
    Point[] foundPoints = sortPoints(points);
    return new Quadrilateral(biggestContour, foundPoints, new Size(srcSize.width, srcSize.height));
}
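The snippet above relies on a sortPoints helper that is not shown in this thread. As a hedged illustration only, one common way to order corners is by x + y and y - x; the class below is a hypothetical stand-in using plain {x, y} arrays in image coordinates (y grows downward), and the package's real helper may differ.

```java
// Hypothetical stand-in for a corner-ordering helper; not the package's sortPoints.
final class CornerSort {
    /** Returns the extreme points ordered as top-left, top-right, bottom-right, bottom-left. */
    static double[][] sortCorners(double[][] pts) {
        double[] tl = pts[0];
        double[] tr = pts[0];
        double[] br = pts[0];
        double[] bl = pts[0];
        for (double[] p : pts) {
            double sum = p[0] + p[1];  // smallest at top-left, largest at bottom-right
            double diff = p[1] - p[0]; // smallest at top-right, largest at bottom-left
            if (sum < tl[0] + tl[1]) {
                tl = p;
            }
            if (sum > br[0] + br[1]) {
                br = p;
            }
            if (diff < tr[1] - tr[0]) {
                tr = p;
            }
            if (diff > bl[1] - bl[0]) {
                bl = p;
            }
        }
        return new double[][] {tl, tr, br, bl};
    }
}
```

Since it only looks for extremes, this also works when it is handed every point of the contour rather than just the four approximated corners.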
@humphreyja could you please take a look at this solution? I already tested it with different invoices and receipts in different devices (with different android version)
Update: I just realized this approach only works for documents like receipts and invoices (rectangles with white borders or similar colors). I'm gonna continue working on this... 😫
Update 2: We solved this problem by converting the image to HSV instead of gray, then using the saturation channel and finally THRESH_BINARY_INV, as follows:
Imgproc.cvtColor(resizedImage, grayImage, Imgproc.COLOR_RGB2HSV, 4);
List<Mat> image = new ArrayList<>(3);
Core.split(grayImage, image);
Mat saturationChannel = image.get(1);
Imgproc.GaussianBlur(saturationChannel, grayImage, new Size(5, 5), 0);
Imgproc.threshold(grayImage, grayImage,127, 255, Imgproc.THRESH_BINARY_INV|Imgproc.THRESH_OTSU);
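A rough sketch of why the saturation channel helps, in plain Java: paper is close to white or gray, so its saturation is low, while a colored table top has high saturation. The formula below is the standard HSV saturation scaled to 0..255; the class name and the sample colors are made up for illustration.

```java
// Hypothetical sketch: HSV saturation of one 8-bit RGB pixel, plus an inverted binary threshold.
final class SaturationSketch {
    /** Saturation scaled to 0..255: (max - min) / max, or 0 for pure black. */
    static int saturation(int r, int g, int b) {
        int max = Math.max(r, Math.max(g, b));
        int min = Math.min(r, Math.min(g, b));
        if (max == 0) {
            return 0;
        }
        return Math.round(255f * (max - min) / max);
    }

    /** Inverted binary threshold: values above thresh become 0, everything else becomes 255. */
    static int binaryInv(int value, int thresh) {
        return value > thresh ? 0 : 255;
    }
}
```

With these made-up samples, an off-white paper pixel (250, 250, 245) has a saturation of about 5 and a red table pixel (200, 40, 40) has 204, so the inverted threshold turns the paper white and the table black. The same arithmetic suggests a white background will be hard, since it is as unsaturated as the paper.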
@cagv24 Awesome work on this! I'm going to need a few days before I can demo this and create a PR, I'm hoping to get this out in the next release (I've been shooting for a new release every 2 weeks or so).
One question, what is the background for detection? Wood table tops can sometimes mess this up due to the grain in the wood for example.
I tested it with red, blue and black backgrounds. The only background we should avoid is a white one (or similar). Maybe we can add some code to handle these cases. Can you share a photo of that particular color (table) so I can test it?
Btw, I can share the Python notebook with you so we can test everything faster if you want.
Here's an example with a receipt on it. Yeah I'd be interested in that. I've been thinking about better ways to test this, so having some tools to make sure we are actually improving the detection is important.
For this image, in particular, I had to apply an erosion with a 9x9 kernel to delete those little black lines in the table. This is the result
However I don't know if we should worry about backgrounds like this.
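For reference, a plain-Java sketch of what an erosion does to thin lines. It treats the thin lines as foreground (true) in a binary mask; whether that matches the polarity of the thresholded image here is an assumption, and the class name is hypothetical. In OpenCV the equivalent call would be Imgproc.erode with a 9x9 kernel.

```java
// Hypothetical sketch of binary erosion with a k x k square kernel (k odd), mask indexed as img[y][x].
final class ErodeSketch {
    /** A pixel stays foreground only if every in-bounds pixel under the kernel is foreground. */
    static boolean[][] erode(boolean[][] img, int k) {
        int h = img.length;
        int w = img[0].length;
        int r = k / 2;
        boolean[][] out = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                boolean keep = true;
                for (int dy = -r; dy <= r && keep; dy++) {
                    for (int dx = -r; dx <= r && keep; dx++) {
                        int yy = y + dy;
                        int xx = x + dx;
                        if (yy < 0 || yy >= h || xx < 0 || xx >= w) {
                            continue; // pixels outside the image do not erode the border
                        }
                        if (!img[yy][xx]) {
                            keep = false;
                        }
                    }
                }
                out[y][x] = keep;
            }
        }
        return out;
    }
}
```

Anything thinner than the kernel disappears, while large regions only lose a border of k / 2 pixels, which is why a 9x9 kernel can wipe out wood grain and still leave the document region in place.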
@humphreyja Hi man, I hope you're well. I just want to ask where I could send you the link to the Google Colab notebook?
@humphreyja, I've just sent you the invitation on google drive. Please let me know if you can access the folder.
Yep I got it! *casually just going to remove email address from feed here haha
Hey guys, I'm so glad I found this library – thanks for the great work, @humphreyja! And thanks for this great contribution @cagv24!
Any chances this is gonna be merged/released soon? Is there anything I can help with?
Hi @buzinas, I've found some issues related to the image background, for instance, a document on glass and other irregular backgrounds. Dropbox has written a blog post where they talk about how to implement edge detection. They are using the Hough Transform instead of Canny, as @humphreyja mentioned before.
This is the blog post: Fast and Accurate Document Detection for Scanning
I tried Dropbox's app and I think it works very well.
What we can try now is to implement this approach and see how it goes.
Yeah, Dropbox is by far the best document scanner I've ever played with. When I was looking for scanners a few months ago, I read all their blog posts. That said, aren't the changes suggested by you any better than the one we have on master right now? If so, it would still be worth it to make the change while we don't come up with something even better. What are your thoughts?
For document scanning related tasks I think it's better; however, this is not a robust solution yet. Let's wait for @humphreyja to check that approach. You could also modify the library yourself by including those changes in the ImageProcessor.java file and testing them.
@buzinas I think you are right that it is a bit better. My experimentation in PR #33 has shown that without something like the Hough Transform, the canny based algorithms all have their limitations in certain conditions. So I believe that the hough transform is most likely the best route to go. Seeing as there is a bit of a time element with this, I think it'd be a better use of all of our efforts to implement the hough transform algorithm and call it good, rather than needing to come back to this later to implement the better solution. That's really why I haven't merged anything in yet.
Thanks for the answers, guys! I'm currently finishing other parts of the app I'm building, but I'll try to help make this right when I get to this.
Hi, guys. How is it going with the Hough Transform approach?
Hey guys, any updates on the approach? Thanks a lot for this amazing library.
Hi, I'm sorry I've been so busy with other projects, so I couldn't work on this.
I have as well. Although, my current project will eventually intersect with this package at which time I will definitely be looking to improve on this. I can't give a date as to when that will be. The current implementation works pretty well but definitely not the best it could be! (Also, part of me really hopes Android will just release a new API similar to Apple for this. It would make everything waaaay easier).
Hi, I actually don't need rectangle detection. But I needed black and white and greyscale filters with the camera and could not find any library for it except for this one. But this library does not have a focusing feature, and OpenCV is increasing my APK size too. Can you guys please suggest any other library or anything else I could do? Please
@mohity777 yeah this package only does auto focus at the moment. We've talked about adding the ability to handle custom focusing on touch at some point. As for your concern with rectangle detection, you could simply clone this package and just remove the rectangle detection controller. I built the class/subclass structure so that all of the rectangle specific stuff is handled in that subclass and can be removed or changed to work for another package.
Any updates?
Is it possible to perform multiple rectangle detection (there are multiple rectangles in the same image)?
When the document itself contains two rectangles, the scanner detects only one of them and crops to that one. I want to detect the full document.