brandonmpetty / Doxa

A Local Adaptive Thresholding framework for image binarization written in C++, with JS and Python bindings. Implementing: Otsu, Bernsen, Niblack, Sauvola, Wolf, Gatos, NICK, Su, T.R. Singh, WAN, ISauvola, Bataineh, Chan and Shafait.
https://brandonmpetty.github.io/Doxa/WebAssembly
Creative Commons Zero v1.0 Universal

Fast implementation of Sauvola-alike methods without integral images #7

Closed. chungkwong closed this issue 3 years ago.

chungkwong commented 5 years ago

Sauvola-like methods are implemented using integral images in Doxa. The main drawback of this approach is that the two integral images consume a lot of memory. In fact, a recent paper (https://arxiv.org/abs/1905.13038) points out a simple way to greatly reduce the memory footprint while keeping the time complexity linear and independent of window size.

brandonmpetty commented 5 years ago

Thank you, using less memory and seeing a speed boost definitely has my attention.

As of now, I am building two 8-byte integral images (mean and std. dev.), with two 8-byte temporaries, for a total of 32 x H x W bytes. I have also implemented an alternative that removes the temporaries, but it pays a slight performance cost.
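For context, the current integral-image approach looks roughly like this (an illustrative sketch, not the exact code in Doxa; the names are made up): two double-precision images, one of values and one of squared values, from which the windowed mean and variance come from four lookups each, independent of window size.

```cpp
#include <cstdint>
#include <vector>

// Sketch only: two 8-byte-per-pixel integral images (values and squared values),
// zero-padded by one row and column so border queries need no special cases.
struct IntegralImages
{
    int width = 0, height = 0;
    std::vector<double> sum, sumSq;   // (width + 1) x (height + 1)

    void Build(const std::vector<uint8_t>& image, int w, int h)
    {
        width = w; height = h;
        sum.assign(static_cast<size_t>(w + 1) * (h + 1), 0.0);
        sumSq.assign(sum.size(), 0.0);

        for (int i = 0; i < h; ++i)
            for (int j = 0; j < w; ++j)
            {
                const double v = image[i * w + j];
                const size_t idx = static_cast<size_t>(i + 1) * (w + 1) + (j + 1);
                // Standard summed-area-table recurrence
                sum[idx]   = v     + sum[idx - 1]   + sum[idx - (w + 1)]   - sum[idx - (w + 1) - 1];
                sumSq[idx] = v * v + sumSq[idx - 1] + sumSq[idx - (w + 1)] - sumSq[idx - (w + 1) - 1];
            }
    }

    // Mean and (population) variance of the rectangle [x0, x1) x [y0, y1), in O(1)
    // regardless of the window size.
    void MeanVariance(int x0, int y0, int x1, int y1, double& mean, double& variance) const
    {
        const auto rect = [&](const std::vector<double>& ii) {
            return ii[static_cast<size_t>(y1) * (width + 1) + x1]
                 - ii[static_cast<size_t>(y0) * (width + 1) + x1]
                 - ii[static_cast<size_t>(y1) * (width + 1) + x0]
                 + ii[static_cast<size_t>(y0) * (width + 1) + x0];
        };
        const double area = static_cast<double>((x1 - x0) * (y1 - y0));
        mean = rect(sum) / area;
        variance = rect(sumSq) / area - mean * mean;
    }
};
```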

I'll try to get it implemented within the next few weeks and let you know how it goes!

brandonmpetty commented 5 years ago

@chungkwong, again, thank you for the notice. I have, however, run into issues trying to implement the algorithm.

There appears to be an array-out-of-bounds issue, e.g. "c ← c + C[j+r] − C[j−l]". Here, j starts at 0 and we are subtracting half our window size. That is akin to C[-14], yet the index range starts at 0 and cannot go negative.

Could you clarify that?

chungkwong commented 5 years ago

All undefined array elements (due to indices out of bounds) are considered zero. My implementation looks like this: https://github.com/chungkwong/binarizer/blob/master/src/main/java/com/github/chungkwong/binarizer/EfficientBinarizer.java
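In C++ terms, the column-sum idea looks roughly like this (a sketch of the approach from the paper and the Java code above, computing only the local mean; it is not a drop-in replacement for Doxa's routines, and the names are illustrative):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// One array of per-column sums slides down the image row by row, and a
// horizontal running sum slides across it. Out-of-range indices simply
// contribute nothing, which matches the "treated as zero" convention above.
// Extra memory is O(width) instead of O(width * height).
std::vector<double> LocalMean(const std::vector<uint8_t>& image,
                              int width, int height, int halfWindow)
{
    std::vector<double> mean(static_cast<size_t>(width) * height);
    std::vector<double> colSum(width, 0.0);   // vertical sums, one per column
    const int r = halfWindow;

    // Prime the column sums with rows [0, r]
    for (int i = 0; i <= std::min(r, height - 1); ++i)
        for (int j = 0; j < width; ++j)
            colSum[j] += image[i * width + j];

    for (int i = 0; i < height; ++i)
    {
        // Horizontal running sum over columns [j - r, j + r], starting at j = 0
        double windowSum = 0.0;
        for (int j = 0; j <= std::min(r, width - 1); ++j)
            windowSum += colSum[j];

        for (int j = 0; j < width; ++j)
        {
            // Count only the pixels actually inside the image for this window
            const int rows = std::min(i + r, height - 1) - std::max(i - r, 0) + 1;
            const int cols = std::min(j + r, width - 1) - std::max(j - r, 0) + 1;
            mean[i * width + j] = windowSum / (rows * cols);

            // Slide the horizontal window one column to the right
            if (j + r + 1 < width) windowSum += colSum[j + r + 1];
            if (j - r >= 0)        windowSum -= colSum[j - r];
        }

        // Slide the column sums one row down for the next iteration
        if (i + r + 1 < height)
            for (int j = 0; j < width; ++j)
                colSum[j] += image[(i + r + 1) * width + j];
        if (i - r >= 0)
            for (int j = 0; j < width; ++j)
                colSum[j] -= image[(i - r) * width + j];
    }
    return mean;
}
```

A second O(width) array of squared column sums, maintained the same way, gives the local variance needed by Sauvola-style formulas.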

brandonmpetty commented 5 years ago

@chungkwong, truly amazing work. Here is what I am currently getting with your optimization applied to Niblack, compared to a few other algorithms using my sample image:

That is amazing. Not to mention the significant memory improvement! I am going to do a deeper dive on the algorithm, see how it performs when compiled to WebAssembly, and this is probably going to be the default optimization used for all of the related algorithms in this framework. You really made my day, thanks for sharing.

brandonmpetty commented 5 years ago

@chungkwong, thanks again for bringing this to my attention. I like the algorithm a lot. I used the same concepts in this framework to create a really fast Min/Max algorithm that I published in January. It uses an ordered set, while your mean / std. dev. approach uses a column-based summation strategy. It would be interesting to see what it would look like to unify the two.
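For comparison, the ordered-set idea in one dimension looks roughly like this (a simplified sketch rather than the published Min/Max code; the function name is illustrative):

```cpp
#include <cstdint>
#include <set>
#include <vector>

// A trailing window of `windowSize` values is kept in an ordered multiset, so
// the current minimum and maximum are simply its first and last elements.
void SlidingMinMax(const std::vector<uint8_t>& row, int windowSize,
                   std::vector<uint8_t>& mins, std::vector<uint8_t>& maxs)
{
    std::multiset<uint8_t> window;
    for (size_t j = 0; j < row.size(); ++j)
    {
        window.insert(row[j]);                                   // value entering
        if (j >= static_cast<size_t>(windowSize))
            window.erase(window.find(row[j - windowSize]));      // value leaving
        mins.push_back(*window.begin());                         // smallest
        maxs.push_back(*window.rbegin());                        // largest
    }
}
```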

It will probably take me a while because I am working on other things, but I will apply your algorithm in this library, and it will probably benefit most people. The nice thing about integral images, however, is that their cost does not depend on the window size, so they are great if you are doing a parameter analysis across the various algorithms (though almost no one is probably doing that). I'll leave this "issue" open until I publish the next version of this framework, but like I said, I probably won't be able to get to it until the start of next year. Thanks again.

chungkwong commented 5 years ago

I think the two algorithms share a common idea: tracking how the quantities (min/max/sum) change as the window slides from left to right.

Some other statistics can also be computed with similar techniques, the sliding median for example.
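For example, a sliding median can be maintained with two ordered multisets (a rough sketch of one common approach, not code from either repository):

```cpp
#include <cstdint>
#include <iterator>
#include <set>

// `low` holds the smaller half of the window and `high` the larger half, with
// every element of low <= every element of high, so the (lower) median is
// always the largest element of low.
class SlidingMedian
{
    std::multiset<uint8_t> low, high;

    void Rebalance()
    {
        // Keep low the same size as high, or one element larger.
        while (low.size() > high.size() + 1)
        {
            auto it = std::prev(low.end());
            high.insert(*it);
            low.erase(it);
        }
        while (high.size() > low.size())
        {
            auto it = high.begin();
            low.insert(*it);
            high.erase(it);
        }
    }

public:
    void Insert(uint8_t v)
    {
        if (low.empty() || v <= *std::prev(low.end())) low.insert(v);
        else high.insert(v);
        Rebalance();
    }

    // v must currently be in the window.
    void Erase(uint8_t v)
    {
        auto it = low.find(v);
        if (it != low.end()) low.erase(it);
        else high.erase(high.find(v));
        Rebalance();
    }

    uint8_t Median() const { return *std::prev(low.end()); }
};
```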

brandonmpetty commented 3 years ago

I have not forgotten about this work, @chungkwong. I am taking a year-long sabbatical starting next year, and this is one of the first things I plan on working on. I am also thankful for your repo; it pointed out one issue in my codebase: I am calculating standard deviation and variance based on the sample, not the population. Not only is population variance appropriate here, it is also faster to compute. On a test image with over 70k pixels, switching resulted in a 1-pixel difference in the binary result, so making this change should not come as a huge shock to anyone.
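For clarity, this is the difference being discussed (a small illustrative sketch, not Doxa's code): population variance divides by n, sample variance divides by n - 1.

```cpp
#include <vector>

// Population variance: divide by n. No Bessel's correction needed.
double PopulationVariance(const std::vector<double>& values)
{
    double sum = 0.0, sumSq = 0.0;
    for (double v : values) { sum += v; sumSq += v * v; }
    const double n = static_cast<double>(values.size());
    const double mean = sum / n;
    return sumSq / n - mean * mean;                 // E[x^2] - (E[x])^2
}

// Sample variance: divide by n - 1 (Bessel's correction).
double SampleVariance(const std::vector<double>& values)
{
    const double n = static_cast<double>(values.size());
    return PopulationVariance(values) * n / (n - 1.0);
}
```

For the large windows used in local thresholding the two differ by only a factor of n / (n - 1), which is why the switch changed a single pixel in that test image.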

brandonmpetty commented 3 years ago

@chungkwong, I have added your algorithm to most of the binarization routines in this library. It is in a new branch, V3. I am going to do more testing and fine-tuning, and will hopefully get it merged next week.