HarvestProfit / react-native-rectangle-scanner

React Native Document/Rectangle Scanner
MIT License

Android rectangle detection #29

Open RohitRastogi94 opened 4 years ago

RohitRastogi94 commented 4 years ago

When the document itself contains two rectangles, the scanner detects only one of them and crops to that. I want to detect the full document.

humphreyja commented 4 years ago

@RohitRastogi94 If I understand correctly, the document contains rectangles. So the package is detecting the rectangle inside of the document, instead of the document itself?

Is this on iOS, Android, or both?

RohitRastogi94 commented 4 years ago

@humphreyja Yes, the document contains multiple rectangles. Currently I am only trying on Android.

RohitRastogi94 commented 4 years ago

@humphreyja any update on this one?

humphreyja commented 4 years ago

@RohitRastogi94 OK, I've run into this before too. I think the algorithm should be looking for the largest rectangle in the image, instead of the smallest.

You can try to remove this conditional and see if that helps. If so, I'll probably update the package with this change. https://github.com/HarvestProfit/react-native-rectangle-scanner/blob/master/android/src/main/java/com/rectanglescanner/helpers/ImageProcessor.java#L167

glodzl commented 4 years ago

On iOS it works pretty well and recognizes rectangles fast. On Android, on the other hand, it barely recognizes the same items.

humphreyja commented 4 years ago

@glodzl I've been experimenting with this a little bit. The main difference between the two platforms is camera quality as well as software quality. On iOS, Apple gives us an API that finds rectangles in the image. On Android, we have to do all of that ourselves using OpenCV (this is why the package is so large). It involves applying filters to the image and finding all the borders in it. It's really interesting stuff, but the outcome is a bit finicky sometimes. As this code is the same as in all the other open source Android scanners out there, I'd love to see how I can improve it. This basically comes down to 2 potential changes:

  1. Changing the filters on the image to get better edge detection
  2. Changing the function that selects the correct rectangle out of the list that was detected in the image.

The above pull request looks at the second one. For the first one, I'm thinking it needs to determine a "brightness" level of the image and adjust the contrast to fit that brightness level before matching rectangles. Either that, or we generate a list of potential rectangles from a low-light image and a high-light image before selecting the best one (usually the largest rectangle).
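
As a rough illustration of the "measure brightness, then adjust contrast" idea, here is a minimal sketch in plain Java. It is not the code from the pull request: the class name `ContrastSketch`, the use of the mean gray level as the brightness measure, and the simple min/max stretch are all assumptions made for the example.

```java
// Hypothetical helper, not part of react-native-rectangle-scanner.
final class ContrastSketch {
    /** Mean gray level (0-255) of an 8-bit grayscale image; a rough "brightness" measure. */
    static double meanBrightness(int[] gray) {
        long sum = 0;
        for (int v : gray) {
            sum += v;
        }
        return gray.length == 0 ? 0.0 : (double) sum / gray.length;
    }

    /** Linearly rescales the image so its darkest pixel maps to 0 and its brightest to 255. */
    static int[] stretchContrast(int[] gray) {
        int min = 255;
        int max = 0;
        for (int v : gray) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        if (max <= min) {
            return gray.clone(); // flat (or empty) image: nothing to stretch
        }
        int[] out = new int[gray.length];
        for (int i = 0; i < gray.length; i++) {
            out[i] = (gray[i] - min) * 255 / (max - min); // integer division truncates
        }
        return out;
    }
}
```

A caller could, for example, only apply `stretchContrast` when `meanBrightness` falls below some cutoff; picking that cutoff would need experimentation on real photos.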

Given our quarantine status here in the U.S., I'll probably be looking into this more over the next few days because there's nothing better to do 😂

cagv24 commented 4 years ago

Hi @humphreyja I'm trying to find a solution for this issue now. Have you found anything?

I gave Genius Scan a try and its edge detection works perfectly. Hopefully we'll get something like that.

humphreyja commented 4 years ago

@cagv24 I've been experimenting with PR #33. So far, I've added a better way to increase the contrast of the image before detection, but I've found that doesn't work either. The problem is the findContours function. With the Canny image filter, it needs a solid boundary around the entire rectangle in order to detect it.

I've been trying to figure out how to use the Hough lines functionality in OpenCV as the next step. Hough lines will fill in the gaps and give a much more accurate detection, though I'm not sure yet how to select which lines map to the rectangle. I assume this is how Grizzly Labs/Genius Scan is doing it.
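
One building block for a Hough-based approach is turning a pair of detected lines into a candidate corner. OpenCV's `Imgproc.HoughLines` reports each line as a (rho, theta) pair satisfying `x*cos(theta) + y*sin(theta) = rho`, so two lines can be intersected by solving a 2x2 linear system. The sketch below shows only that step; the class name `HoughCornerSketch` is made up for the example, and choosing which four lines actually bound the document remains the open question.

```java
// Hypothetical helper, not part of react-native-rectangle-scanner.
final class HoughCornerSketch {
    /**
     * Intersects two lines given in Hough normal form (rho, theta).
     * Returns {x, y}, or null when the lines are (nearly) parallel.
     */
    static double[] intersect(double rho1, double theta1, double rho2, double theta2) {
        double a1 = Math.cos(theta1);
        double b1 = Math.sin(theta1);
        double a2 = Math.cos(theta2);
        double b2 = Math.sin(theta2);
        double det = a1 * b2 - a2 * b1; // equals sin(theta2 - theta1)
        if (Math.abs(det) < 1e-9) {
            return null; // parallel lines never meet
        }
        // Cramer's rule on: a1*x + b1*y = rho1, a2*x + b2*y = rho2
        double x = (rho1 * b2 - rho2 * b1) / det;
        double y = (a1 * rho2 - a2 * rho1) / det;
        return new double[] {x, y};
    }
}
```

For instance, the vertical line x = 3 (rho 3, theta 0) and the horizontal line y = 4 (rho 4, theta π/2) meet at (3, 4).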

cagv24 commented 4 years ago

@humphreyja I'm currently looking at this question/answer on StackOverflow: opencv-android-create-new-image-using-the-edges-of-the-largest-contour. The author says he can find the biggest rectangle with that implementation (maybe we should take a look).

Notes:

He applies the Canny filter before the Gaussian blur, and the RETR_TYPE is RETR_LIST instead of RETR_TREE.

I'm doing some tests with that approach and for now, it seems good.

humphreyja commented 4 years ago

@cagv24 Canny before the blur is actually not a bad idea haha. That could help fill in the missing holes. I didn't have much success with changing the RETR_TYPE flag. RETR_TREE stores the hierarchy of the contours (which maybe we don't need). Let me know how your tests go. I feel that just swapping the order of the blur may be a good incremental improvement without needing to implement any more complex algorithms.

cagv24 commented 4 years ago

After working with a friend (@Luifersuarez), we were able to figure out this problem. The filters we apply now are as follows:

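        // Convert to grayscale, smooth, then binarize.
        Imgproc.cvtColor(resizedImage, grayImage, Imgproc.COLOR_RGBA2GRAY, 4);
        Imgproc.GaussianBlur(grayImage, grayImage, new Size(5, 5), 0);
        // With THRESH_OTSU, OpenCV computes the threshold itself; the 127 passed here is ignored.
        Imgproc.threshold(grayImage, grayImage, 127, 255, Imgproc.THRESH_BINARY | Imgproc.THRESH_OTSU);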

THRESH_OTSU is what did the trick.
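
To make clearer what the Otsu flag does, here is a minimal plain-Java sketch of Otsu's method: it scans every possible threshold and keeps the one that maximizes the between-class variance of the two resulting pixel groups. This is an illustration written for this thread, not OpenCV's implementation; the class and method names are made up, and pixels are assumed to be 8-bit values in 0..255.

```java
// Hypothetical illustration of Otsu's method; not OpenCV code.
final class OtsuSketch {
    /** Returns the threshold t (0-255) maximizing between-class variance of {<= t} vs {> t}. */
    static int otsuThreshold(int[] pixels) {
        int[] hist = new int[256];
        for (int p : pixels) {
            hist[p]++;
        }
        long total = pixels.length;
        long sumAll = 0;
        for (int i = 0; i < 256; i++) {
            sumAll += (long) i * hist[i];
        }
        long sumBg = 0;    // sum of gray levels in the "background" class (<= t)
        long weightBg = 0; // number of pixels in the background class
        double bestVar = -1.0;
        int best = 0;
        for (int t = 0; t < 256; t++) {
            weightBg += hist[t];
            sumBg += (long) t * hist[t];
            if (weightBg == 0) {
                continue; // no pixels at or below t yet
            }
            long weightFg = total - weightBg;
            if (weightFg == 0) {
                break; // every pixel is already in the background class
            }
            double meanBg = (double) sumBg / weightBg;
            double meanFg = (double) (sumAll - sumBg) / weightFg;
            double diff = meanBg - meanFg;
            double betweenVar = (double) weightBg * weightFg * diff * diff;
            if (betweenVar > bestVar) {
                bestVar = betweenVar;
                best = t;
            }
        }
        return best;
    }

    /** Binarizes like THRESH_BINARY: values above the threshold become 255, the rest 0. */
    static int[] binarize(int[] pixels, int threshold) {
        int[] out = new int[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            out[i] = pixels[i] > threshold ? 255 : 0;
        }
        return out;
    }
}
```

Because the threshold adapts to each image's histogram, a dim photo and a bright photo of the same receipt can both be split into "paper" and "background" without hand-tuning a fixed value.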

Then, in the findContours method, we used the retrieval type RETR_EXTERNAL.

The last step is the algorithm that finds the contour with the biggest area (this extra step may not be necessary, because the filters do their job so well that there is only one contour). This is the code:

        MatOfPoint biggestContour = null;
        double maxArea = 0.0;
        for (MatOfPoint contour : contours) {
            double area = Imgproc.contourArea(contour);
            // Skip small contours; this also nudges the user to take photos from closer.
            if (area > size.area() / 5) {
                // Approximate the contour with a polygon (tolerance: 2% of the perimeter).
                MatOfPoint2f contour2f = new MatOfPoint2f(contour.toArray());
                double peri = Imgproc.arcLength(contour2f, true);
                MatOfPoint2f approx = new MatOfPoint2f();
                Imgproc.approxPolyDP(contour2f, approx, 0.02 * peri, true);
                // Keep the largest contour whose approximation has exactly 4 vertices.
                if (area > maxArea && approx.total() == 4) {
                    biggestContour = contour;
                    maxArea = area;
                }
            }
        }

        if (biggestContour != null) {
            Point[] points = biggestContour.toArray();

            Point[] foundPoints = sortPoints(points);
            return new Quadrilateral(biggestContour, foundPoints, new Size(srcSize.width, srcSize.height));
        }
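
The `sortPoints` helper called in the snippet above is not shown in the thread. A common way to order the corners of a roughly axis-aligned quadrilateral is by the sum and difference of the coordinates, sketched below in plain Java. This is an assumption about what such a helper might do, not the library's actual code; the class name and the `double[]{x, y}` point representation are invented for the example, and image coordinates are assumed to have y growing downward.

```java
// Hypothetical stand-in for a corner-ordering helper; not the library's sortPoints.
final class CornerOrderSketch {
    /**
     * Picks four extreme points from a set of {x, y} points and returns them as
     * {topLeft, topRight, bottomRight, bottomLeft}. The input must not be empty.
     */
    static double[][] orderCorners(double[][] pts) {
        double[] tl = pts[0];
        double[] tr = pts[0];
        double[] br = pts[0];
        double[] bl = pts[0];
        for (double[] p : pts) {
            double sum = p[0] + p[1];  // smallest at top-left, largest at bottom-right
            double diff = p[1] - p[0]; // smallest at top-right, largest at bottom-left
            if (sum < tl[0] + tl[1]) {
                tl = p;
            }
            if (sum > br[0] + br[1]) {
                br = p;
            }
            if (diff < tr[1] - tr[0]) {
                tr = p;
            }
            if (diff > bl[1] - bl[0]) {
                bl = p;
            }
        }
        return new double[][] {tl, tr, br, bl};
    }
}
```

Because it only looks for extremes, this works whether it receives the four polygon vertices or every point of the contour. It becomes unreliable when the document is rotated close to 45 degrees, since two corners can then tie on the sum or the difference.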

@humphreyja could you please take a look at this solution? I've already tested it with different invoices and receipts on different devices (with different Android versions).

Update: I just realized this approach only works for documents like receipts and invoices (rectangles with white borders or similar colors). I'm gonna continue working on this... 😫

Update 2: We solved this problem by converting the image to HSV instead of gray, then using the saturation channel, and finally applying THRESH_BINARY_INV, as follows:

        Imgproc.cvtColor(resizedImage, grayImage, Imgproc.COLOR_RGB2HSV, 4);
        List<Mat> image = new ArrayList<>(3);
        Core.split(grayImage, image);
        Mat saturationChannel = image.get(1);
        Imgproc.GaussianBlur(saturationChannel, grayImage, new Size(5, 5), 0);
        Imgproc.threshold(grayImage, grayImage, 127, 255, Imgproc.THRESH_BINARY_INV | Imgproc.THRESH_OTSU);
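
A likely reason the saturation channel helps: white or near-white paper has very low saturation, while a colored background has high saturation, so the two separate well even when their brightness is similar. The plain-Java sketch below computes saturation for a single RGB pixel on the 0..255 scale that OpenCV uses for 8-bit HSV images. The class name is made up, and this is an illustration of the formula rather than OpenCV's code.

```java
// Hypothetical illustration of the HSV saturation formula; not OpenCV code.
final class SaturationSketch {
    /** Saturation of an 8-bit RGB pixel, scaled to 0..255: 255 * (max - min) / max. */
    static int saturation(int r, int g, int b) {
        int max = Math.max(r, Math.max(g, b));
        int min = Math.min(r, Math.min(g, b));
        if (max == 0) {
            return 0; // pure black: saturation is defined as 0
        }
        return Math.round(255f * (max - min) / max);
    }
}
```

With this formula a white pixel scores 0 and a pure red pixel scores 255, which is why the inverted threshold (THRESH_BINARY_INV) turns the low-saturation paper into the white foreground.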

humphreyja commented 4 years ago

@cagv24 Awesome work on this! I'm going to need a few days before I can demo this and create a PR, I'm hoping to get this out in the next release (I've been shooting for a new release every 2 weeks or so).

One question, what is the background for detection? Wood table tops can sometimes mess this up due to the grain in the wood for example.

cagv24 commented 4 years ago

I tested it with red, blue and black backgrounds. The only background we should avoid is a white one (or similar); maybe we can add some code to handle these cases. Can you share a photo of that particular color (table) so I can test it?

Btw, I can share the Python notebook with you so we can test everything faster, if you want.

humphreyja commented 4 years ago

Here's an example with a receipt on it. Yeah I'd be interested in that. I've been thinking about better ways to test this, so having some tools to make sure we are actually improving the detection is important. Image 2

cagv24 commented 4 years ago

For this image in particular, I had to apply an erosion with a 9x9 kernel to delete those little black lines in the table. This is the result:
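
For readers unfamiliar with erosion, here is a minimal plain-Java sketch of the operation on a binary image. Erosion replaces each pixel with the minimum of its k-by-k neighbourhood, so bright features narrower than the kernel disappear. Whether the grain lines are bright or dark at that stage depends on the threshold flags and is not stated in the thread, so treat this only as an illustration of the operation. The class name is invented, and pixels outside the image are simply ignored.

```java
// Hypothetical illustration of morphological erosion; not OpenCV code.
final class ErosionSketch {
    /** Erodes a non-empty binary image (values 0 or 255) with a k-by-k square kernel, k odd. */
    static int[][] erode(int[][] img, int k) {
        int h = img.length;
        int w = img[0].length;
        int r = k / 2;
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int min = 255;
                for (int dy = -r; dy <= r; dy++) {
                    for (int dx = -r; dx <= r; dx++) {
                        int yy = y + dy;
                        int xx = x + dx;
                        if (yy < 0 || yy >= h || xx < 0 || xx >= w) {
                            continue; // neighbours outside the image do not count
                        }
                        min = Math.min(min, img[yy][xx]);
                    }
                }
                out[y][x] = min;
            }
        }
        return out;
    }
}
```

A one-pixel-wide bright line vanishes under a 3x3 kernel, while a band three pixels wide keeps its middle row. A 9x9 kernel applies the same idea to features up to eight pixels wide.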

image

However I don't know if we should worry about backgrounds like this.

cagv24 commented 4 years ago

@humphreyja Hi man, I hope you're well. I just want to ask where I could send you the link to the Google Colab notebook.

cagv24 commented 4 years ago

@humphreyja, I've just sent you the invitation on google drive. Please let me know if you can access the folder.

humphreyja commented 4 years ago

Yep, I got it! *casually just going to remove email address from feed here haha

buzinas commented 4 years ago

Hey guys, I'm so glad I found this library – thanks for the great work, @humphreyja! And thanks for this great contribution @cagv24!

Any chances this is gonna be merged/released soon? Is there anything I can help with?

cagv24 commented 4 years ago

Hi @buzinas, I've found some issues related to the image background, for instance a document on glass and other irregular backgrounds. Dropbox has written a blog post where they talk about how to implement edge detection. They are using the Hough Transform instead of Canny, as @humphreyja mentioned before.

This is the blog post: Fast and Accurate Document Detection for Scanning

I tried Dropbox's app and I think it works very well.

What we can try now is to implement this approach and see how it goes.

buzinas commented 4 years ago

Yeah, Dropbox is by far the best document scanner I've ever played with. When I was looking for scanners a few months ago, I read all their blog posts. That said, aren't the changes you suggested still better than what we have on master right now? If so, it would still be worth making the change until we come up with something even better. What are your thoughts?

cagv24 commented 4 years ago

For document-scanning tasks I think it's better; however, this is not a robust solution yet. Let's wait for @humphreyja to check that approach. You could also modify the library yourself by including those changes in the ImageProcessor.java file and testing them.

humphreyja commented 4 years ago

@buzinas I think you are right that it is a bit better. My experimentation in PR #33 has shown that, without something like the Hough transform, the Canny-based algorithms all have their limitations in certain conditions. So I believe that the Hough transform is most likely the best route to go. Seeing as there is a bit of a time element with this, I think it'd be a better use of all of our efforts to implement the Hough transform algorithm and call it good, rather than needing to come back to this later to implement the better solution. That's really why I haven't merged anything in yet.

buzinas commented 4 years ago

Thanks for the answers, guys! I'm currently finishing other parts of the app I'm building, but I'll try to help making this right when I get to this.

ahmedsu commented 4 years ago

Hi, guys. How is it going with the Hough Transform approach?

pratyaksh123 commented 4 years ago

Hey, Guys any updates on the approach? Thanks a lot for this amazing library.

cagv24 commented 4 years ago

Hi, I'm sorry I've been so busy with other projects, so I couldn't work on this.

humphreyja commented 4 years ago

I have as well. Although, my current project will eventually intersect with this package at which time I will definitely be looking to improve on this. I can't give a date as to when that will be. The current implementation works pretty well but definitely not the best it could be! (Also, part of me really hopes Android will just release a new API similar to Apple for this. It would make everything waaaay easier).

mohity777 commented 3 years ago

Hi, I actually don't need rectangle detection. But I needed black-and-white and greyscale filters with the camera and could not find any library for that except this one. However, this library does not have a focusing feature, and OpenCV is increasing my APK size too. Can you guys please suggest any other library or anything else I could do? Please.

humphreyja commented 3 years ago

@mohity777 Yeah, this package only does auto focus at the moment. We've talked about adding the ability to handle custom focusing on touch at some point. As for your concern with rectangle detection, you could simply clone this package and remove the rectangle detection controller. I built the class/subclass structure so that all of the rectangle-specific stuff is handled in that subclass and can be removed or changed to work for another package.

erennyuksell commented 3 years ago

Any updates?

stephenjen commented 3 years ago

Is it possible to perform multiple rectangle detection (there are multiple rectangles in the same image)?