sandreas closed this issue 3 years ago
Skimming the paper, it looks like it should be straightforward, and it has a decent number of citations. It won't make it into the upcoming stable release, but it could make it into the next one.
Nice :-) Thank you.
Hi, I don't want to be too skeptical, but are you sure it is that superior to Sauvola? On the GitHub page the example is really bad. From my experience, Sauvola and Otsu (slow) are extremely good with a good parameter set.
> Hi, I don't want to be too skeptical
It's OK :-) In a discussion, nothing is too skeptical. Especially because I'm not a professional in this topic.
> are you sure it is that superior to Sauvola
Not 100% sure, but according to all the citations I read, it was superior in many cases.
> On the GitHub page the example is really bad
Bad in what way? Not representative, or a bad result? If you think it's a bad result, are you sure you did not mix it up?
[Four images, in order: original, Niblack, Sauvola, Wolf et al.]
In my opinion, the fourth result is the best one for representing the ground truth...
Tangentially related: these thresholding algorithms are now threaded. With concurrency turned on, speed scales roughly linearly with the number of cores in your computer.
Which algorithms in which version?
The latest SNAPSHOT, which is a candidate for the next stable. Just turned on threading by default. I think all thresholding algorithms have at least part of the code parallelized now.
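For readers wondering why a threshold pass parallelizes so well, here is a minimal sketch using plain Java streams. It is not BoofCV's implementation: the class and method names are hypothetical, and a single global threshold stands in for the local ones discussed in this thread. The point is only that each row writes to its own slice of the output, so rows can be processed on different cores without locking.

```java
import java.util.stream.IntStream;

/** Hypothetical illustration of row-wise parallel thresholding; not BoofCV code. */
final class ParallelThresholdSketch {
    private ParallelThresholdSketch() {}

    /**
     * Marks a pixel as 1 when its gray value is <= threshold, else 0.
     * Pixels are stored row-major in a flat array of size width * height.
     */
    static int[] threshold(int[] gray, int width, int height, int threshold) {
        int[] binary = new int[width * height];
        // Each row touches a disjoint range of indices, so parallel rows never collide.
        IntStream.range(0, height).parallel().forEach(y -> {
            int rowStart = y * width;
            for (int x = 0; x < width; x++) {
                int i = rowStart + x;
                binary[i] = gray[i] <= threshold ? 1 : 0;
            }
        });
        return binary;
    }
}
```

Calling `ParallelThresholdSketch.threshold(pixels, width, height, 128)` returns the binary mask. Whether the speedup really approaches one core's worth per core depends on image size and memory bandwidth.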
This is great news, thank you :-)
https://perso.liris.cnrs.fr/cwolf/papers/icpr2002v.pdf
Just saving a reference to the paper. It has enough citations and source code that it's worth taking a closer look.
Resurrecting this ticket. Just added it to the "official" list of features to be added in the upcoming release. It's only been about two years; still need it, @sandreas? ;-)
@lessthanoptimal It's not that I need it, but let me ask: is there a superior thresholding algorithm implemented, and how do I use it? ;-)
https://github.com/lessthanoptimal/BoofCV/pull/243
Turns out this was very easy. Niblack, Sauvola, and Wolf are 99% the same code; only the innermost function is different. Wolf and Sauvola look very similar unless the image is oversaturated. There are also two different versions of Sauvola out there. The one in BoofCV is based on the journal article. No actual change in behavior though.
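To make the "only the innermost function is different" point concrete, here is a small sketch of the three per-pixel threshold rules as they are commonly stated in the literature. This is not the code from the pull request: the class name, method names, and parameter names are hypothetical. Here `mean` and `stdev` are the local window statistics, `r` is Sauvola's assumed dynamic range of the standard deviation (often 128 for 8-bit images), `maxStdev` is the largest local standard deviation over the whole image, and `minGray` is the smallest gray value in the image.

```java
/** Hypothetical side-by-side of the three local threshold rules; not BoofCV code. */
final class LocalThresholds {
    private LocalThresholds() {}

    /** Niblack: T = m + k*s. k is usually negative for dark text on a light background. */
    static double niblack(double mean, double stdev, double k) {
        return mean + k * stdev;
    }

    /** Sauvola: T = m * (1 + k*(s/R - 1)). */
    static double sauvola(double mean, double stdev, double k, double r) {
        return mean * (1.0 + k * (stdev / r - 1.0));
    }

    /**
     * Wolf et al.: T = (1-k)*m + k*M + k*(s/R)*(m - M),
     * written here in the equivalent form m + k*(s/R - 1)*(m - M),
     * where R is the maximum local standard deviation and M the minimum gray value.
     */
    static double wolf(double mean, double stdev, double k, double maxStdev, double minGray) {
        return mean + k * (stdev / maxStdev - 1.0) * (mean - minGray);
    }
}
```

Everything around these functions (computing the local mean and standard deviation per window, then comparing each pixel against T) can be shared. Note that when `minGray` is 0 and `maxStdev` equals Sauvola's `r`, the Wolf expression reduces algebraically to Sauvola's, so the two differ only through those image-wide statistics.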
I think all of the more famous "classical" thresholding algorithms have been added. I wonder how difficult adding basic OCR would be.
I have already experimented with some "basic" OCR, and it should not be underestimated. Pure OCR may already be complex enough because of the wide variety of fonts and languages out there (not to mention RTL languages), and the preprocessing steps (thresholding, deskewing, etc.) are also very important (see https://towardsdatascience.com/pre-processing-in-ocr-fc231c6035a7).
I once found a paper about line detection on geographical maps (curved lines, different colors) that looked very promising, but I did not have the time to try it out with BoofCV. Unfortunately, I have not been able to find it again since...
But let me say that I would love to see a more professional BoofCV approach to OCR with LSTM or other AI techniques, if you find the time :-)
I'm thinking of something very simple to start with: targeting 5 or so very similar blocky "industrial" fonts in an uncluttered image. Just alphanumeric English characters with clear separation. I would prefer not to use any DL approaches, since those will be very slow on a CPU and are probably handled better by a dedicated library. If you know anything that would fit the bill, let me know!
Hey,
I would appreciate it if you would add Wolf binarization to BoofCV. It seems to be a very nice little algorithm with impressive results (much better than Sauvola).
Sample Code in C: https://github.com/chriswolfvision/local_adaptive_binarization
What do you think?