seung-lab / connected-components-3d

Connected components on discrete and continuous multilabel 3D & 2D images. Handles 26, 18, and 6 connected variants; periodic boundaries (4, 8, & 6)
GNU Lesser General Public License v3.0
359 stars 43 forks source link

Better Algorithm Mk. II #37

Open william-silversmith opened 4 years ago

william-silversmith commented 4 years ago

Discussed this some in #6

There are several faster algorithms than Wu et al's 2005 decision tree.

image

Figure from Grana, Borghesani, Cucchiara. "FAST BLOCK BASED CONNECTED COMPONENTS LABELING". IEEE 2009. doi: 10.1109/ICIP.2009.5413731

william-silversmith commented 4 years ago

It's also worth noting that Wu's algorithm is good enough that we are in a regime where the best known algorithm probably won't gain us more than 2x performance, while parallel could gain us several times more than that.

william-silversmith commented 4 years ago

The block based approach is undermined by multi-label. It depends on being able to summarize entire regions using the knowledge that all foreground pixels will match.

william-silversmith commented 4 years ago

It might still be fun to implement block-based for binary images, which are an important class of operation.

william-silversmith commented 3 years ago

OpenCV implements Grana et al's Optimized Block Based Decision Tree for binary images (apparently by that team themselves) and I tested it out on a medical YACCLAB dataset. Their algorithm was almost twice as fast, getting 470-493 MVx/sec vs about 250 MVx/sec from mine. Pretty impressive. I had assumed that the ceiling for a good algorithm was more like 20% rather than nearly 100%.

william-silversmith commented 3 years ago

Was able to provide a better algorithm for 6-connected which makes me happy. I think the real secret of BBDT is the ability to make 1/4 fewer writes in the first pass. Will have to try that out.

william-silversmith commented 3 years ago

BBDT and approaches derived from it are not useful for multilabel, however I suspect that the pixel prediction approach is readily compatible. Create a forest of trees for each situation and use goto to select the correct tree for the next voxel. This can allow the elimination of a lot of duplicate work. 4 and 6 connected are prime candidates for this. 6-connected especially is about 2-3x worse than a null pass, so there's lots of room for improvement.

This technique is described in some detail in this paper: https://prittt.github.io/pub_files/2016acivs.pdf

For multi-label, we would use knowledge of the previous differently labeled pixel to remove pieces from the full decision tree rather than positively select information that is already known.

william-silversmith commented 2 years ago

It would be cool if I integrated some of the crazy binary algorithms from YACCLAB. That group is on a tear.