Open Piolie opened 5 years ago
I don't need that implementation, as ST already has a connected-component labeling implementation and uses it internally.
I'll just add a despeckle option named threshold: all components smaller than the threshold value will be removed, no matter where they are placed.
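For illustration only, here is a minimal sketch of what such a size threshold could look like. It is not ST's internal code: the function name remove_small_components, the list-of-rows image format (1 = black, 0 = white) and the choice of 8-connectivity are all assumptions made for this example.

```python
from collections import deque


def remove_small_components(image, threshold):
    """Return a copy of `image` with every black component that has
    fewer than `threshold` pixels erased.

    Assumed format: `image` is a list of rows, 1 = black, 0 = white.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = [row[:] for row in image]
    visited = [[False] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or visited[y][x]:
                continue
            # Flood-fill one component (8-connectivity, an assumption).
            component = []
            queue = deque([(y, x)])
            visited[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] == 1
                                and not visited[ny][nx]):
                            visited[ny][nx] = True
                            queue.append((ny, nx))
            # Erase the component regardless of where it sits on the page.
            if len(component) < threshold:
                for cy, cx in component:
                    result[cy][cx] = 0
    return result
```

With threshold set to 3, a 2-pixel speck is erased while a 6-pixel blob is kept, even if the speck sits right next to other content.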
Not sure if the implementation would also allow for the following, which I would also find useful in this context:
Remove components thinner than a certain number of pixels, so that e.g. a hair could be removed even if it produces a long structure and covers more pixels than a printed dot, as long as it is thinner than any printed line.
As I said, I don't know if the maths for it is already implemented, but it could be based on the number of pixels within the structure per pixel on the edge (1 for a single-pixel line, 2 for 2-pixel lines, etc.), or on the distance of "inner" pixels from the edge. Or maybe there's a smarter algorithm in either ImageMagick or ImageJ.
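To make the second idea (distance of "inner" pixels from the edge) concrete, here is a rough sketch. It is only one possible reading of the suggestion, not an existing ST, ImageMagick or ImageJ routine. The name remove_thin_components, the min_depth parameter and the list-of-rows image format (1 = black, 0 = white) are invented for this example. Depth is measured as city-block distance to the nearest white pixel, so a long straight line that is t pixels thick reaches a maximum depth of ceil(t / 2).

```python
from collections import deque

NEIGHBORS_4 = ((1, 0), (-1, 0), (0, 1), (0, -1))


def remove_thin_components(image, min_depth):
    """Erase black components whose deepest pixel is closer than
    `min_depth` to the background. Pixels outside the image count
    as background. Returns a new image; the input is not modified.
    """
    height = len(image)
    width = len(image[0]) if height else 0

    def is_black(y, x):
        return 0 <= y < height and 0 <= x < width and image[y][x] == 1

    # Step 1: depth of each black pixel, via multi-source BFS that
    # starts from all edge pixels (depth 1) and grows inwards.
    depth = [[0] * width for _ in range(height)]
    queue = deque()
    for y in range(height):
        for x in range(width):
            if image[y][x] == 1 and any(
                    not is_black(y + dy, x + dx) for dy, dx in NEIGHBORS_4):
                depth[y][x] = 1
                queue.append((y, x))
    while queue:
        y, x = queue.popleft()
        for dy, dx in NEIGHBORS_4:
            ny, nx = y + dy, x + dx
            if is_black(ny, nx) and depth[ny][nx] == 0:
                depth[ny][nx] = depth[y][x] + 1
                queue.append((ny, nx))

    # Step 2: flood-fill components (8-connectivity, an assumption)
    # and erase those whose maximum depth is below min_depth.
    result = [row[:] for row in image]
    visited = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or visited[y][x]:
                continue
            component = []
            queue = deque([(y, x)])
            visited[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if is_black(ny, nx) and not visited[ny][nx]:
                            visited[ny][nx] = True
                            queue.append((ny, nx))
            if max(depth[cy][cx] for cy, cx in component) < min_depth:
                for cy, cx in component:
                    result[cy][cx] = 0
    return result
```

With min_depth set to 2, a 10-pixel hair that is one pixel thick is removed, while a 3x3 dot of only 9 pixels survives because its centre pixel has depth 2. A structure that is thin in most places but thick in one spot would be kept whole by this sketch, so it is a starting point rather than a finished filter.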
I already use the algorithm you described in the noise reduction of the color segmentation, for removing long thin components. Yes, I am thinking of implementing the new option this way.
The current despeckle algorithm works well most of the time. However, I have seen that it fails even for tiny particles if they are very near the rest of the content (for example, in between text lines). Raising the Despeckle level does not improve the result. On the contrary, the algorithm starts eating away the dots over the letters i or the full stops.
I think it would be nice to have the option to erase all black/white areas that have a pixel count below a settable threshold.
Currently this can be achieved by applying ImageMagick's connected-component labeling on the output of ScanTailor. The license is compatible with the GPL, so maybe it is easy to implement here.