seung-lab / connected-components-3d

Connected components on discrete and continuous multilabel 3D & 2D images. Handles 26, 18, and 6 connected variants; periodic boundaries (4, 8, & 6)
GNU Lesser General Public License v3.0

Apply mask prior to segmentation? #40

Closed atpolonsky closed 4 years ago

atpolonsky commented 4 years ago

Hi, I'm a new user of the package and was wondering if there is functionality to apply a mask before running connected components, or if masks must be applied after processing the whole volume? If possible, this could be helpful to speed up processing for large data volumes (1000s of pixels on edge) where the regions of interest are a small subset of the sample volume.

I don't know the details of your implementation, so I could envision this actually slowing down the process instead of speeding it up, depending on the specifics of the algorithm, especially if it means keeping a whole duplicate array for the mask in memory at the same time.

william-silversmith commented 4 years ago

You can try masking the source image with numpy operators before applying cc3d, e.g. img *= boolean_mask. You may also find useful masking operators in another library I wrote called fastremap. Let me know if that helps!
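A minimal sketch of that masking step, using numpy only. The toy volume, its shape, and the region of interest are made up for illustration; only img *= boolean_mask comes from the suggestion above.

```python
import numpy as np

# Hypothetical toy volume standing in for a real multilabel image.
# Every voxel starts out non-zero (values 1..4).
rng = np.random.default_rng(0)
img = rng.integers(1, 5, size=(4, 4, 4), dtype=np.uint32)

# Hypothetical region of interest: keep only the central 2x2x2 block.
boolean_mask = np.zeros(img.shape, dtype=bool)
boolean_mask[1:3, 1:3, 1:3] = True

# In-place masking: voxels outside the mask become 0 (background),
# voxels inside the mask keep their original label.
img *= boolean_mask

foreground_voxels = int(np.count_nonzero(img))
```

The masked array would then be passed to cc3d.connected_components in place of the original. Note that the in-place multiply overwrites img, so copy it first if the unmasked data is still needed.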

william-silversmith commented 4 years ago

The algorithm skips regions that are colored black (0), so masking will speed up the run time.

atpolonsky commented 4 years ago

I'm still getting a returned feature of size equal to the number of black (0 value) voxels in my volume, even after applying a mask. Is this the expected behavior?

william-silversmith commented 4 years ago

Can you show me some of your code? How are you measuring the feature and performing the mask? Have you visualized the volume?

atpolonsky commented 4 years ago

I think I figured it out. The labels coming out of cc3d.connected_components include a region 0, and I was subsequently counting it as a feature (np.unique includes 0 as a unique number in the array). I can't actually tell whether the algorithm is checking black voxels in this case, but I did see a slight slowdown when I made all of the voxels non-zero and ran the code again, maybe a second or two difference in run time at most. My test dataset volume was 1500 x 500 x 800 voxels, but I have larger volumes on the order of 5000 x 5000 x 500 voxels, so it's possible that without masking these would take much longer. If this behavior makes sense you can mark this as closed.

william-silversmith commented 4 years ago

cc3d takes a multi-labeled image as input and counts black voxels in that image as background. Non-zero voxels are counted as foreground, though neighboring values that do not match are not counted as part of the same component. It outputs an image with the foreground voxels labeled 1 to N where there are N connected regions of foreground voxels. Background labels (black voxels) remain unchanged and are treated specially so that the innermost loop skips over them without applying the decision tree. Therefore, you should expect better performance on an image that has a lot of black voxels.

https://github.com/seung-lab/connected-components-3d/blob/1b7479c9945276107f40bfba8a6113e72fbf1b02/cc3d.hpp#L307-L309

Since the output relabeled image retains the black voxels, np.unique will indeed output an accounting of them. You can try using fastremap.unique for a faster implementation of that function.

https://github.com/seung-lab/fastremap/blob/master/fastremap.pyx#L705

As a 2D example:

INPUT IMAGE                OUTPUT IMAGE

0 0 4 5 6 4                0 0 1 2 3 4
0 4 0 5 6 4                0 1 0 2 3 4
0 4 0 5 6 4  ====cc3d===>  0 1 0 2 3 4
0 0 0 5 5 6                0 0 0 2 2 3
0 0 0 6 6 0                0 0 0 3 3 0
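The labeling rules in this example can be reproduced with a naive flood fill. This is only an illustration of the semantics (zero is background, neighbors join only when their input values match, 8-connectivity in 2D), not cc3d's actual algorithm, and the function name label_multilabel_2d is hypothetical. Label numbering here follows row-major scan order, which happens to match the diagram; cc3d's numbering is not guaranteed to agree in general.

```python
from collections import deque
from itertools import product

import numpy as np


def label_multilabel_2d(img):
    """Naive 8-connected multilabel labeling by flood fill (illustration only)."""
    out = np.zeros(img.shape, dtype=np.uint32)
    # The 8 neighbor offsets: the 3x3 window minus its center.
    offsets = [d for d in product((-1, 0, 1), repeat=2) if d != (0, 0)]
    next_label = 0
    rows, cols = img.shape
    for r, c in product(range(rows), range(cols)):
        # Zero is background and is skipped; labeled pixels are already done.
        if img[r, c] == 0 or out[r, c] != 0:
            continue
        next_label += 1
        out[r, c] = next_label
        queue = deque([(r, c)])
        while queue:
            y, x = queue.popleft()
            for dy, dx in offsets:
                ny, nx = y + dy, x + dx
                inside = 0 <= ny < rows and 0 <= nx < cols
                # Only join unlabeled neighbors holding the same input value.
                if inside and out[ny, nx] == 0 and img[ny, nx] == img[y, x]:
                    out[ny, nx] = next_label
                    queue.append((ny, nx))
    return out


# The input image from the diagram above.
input_image = np.array([
    [0, 0, 4, 5, 6, 4],
    [0, 4, 0, 5, 6, 4],
    [0, 4, 0, 5, 6, 4],
    [0, 0, 0, 5, 5, 6],
    [0, 0, 0, 6, 6, 0],
])
labels = label_multilabel_2d(input_image)
```

The two groups of 4s receive different labels (1 and 4) because they never touch, while the diagonally adjacent 6s are merged into a single component (3).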

Applying np.unique(labels, return_counts=True) to the output image in the above example simply tallies every value in the array, with no knowledge of which values are component labels, so the result includes a count of the black (0) pixels alongside the components.
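A short sketch of how the background entry could be dropped before counting features, using the output image from the 2D example. The variable names are illustrative only.

```python
import numpy as np

# Output image from the 2D example (0 = background, 1..4 = components).
labels = np.array([
    [0, 0, 1, 2, 3, 4],
    [0, 1, 0, 2, 3, 4],
    [0, 1, 0, 2, 3, 4],
    [0, 0, 0, 2, 2, 3],
    [0, 0, 0, 3, 3, 0],
], dtype=np.uint32)

# np.unique tallies every distinct value, including the background 0.
values, counts = np.unique(labels, return_counts=True)

# Drop the background entry so only real components remain.
foreground = values != 0
component_ids = values[foreground]
component_sizes = counts[foreground]
num_components = len(component_ids)
```

Here the raw np.unique result has five entries (0 through 4, with 13 background pixels), while num_components is 4, which is the number of connected regions in the image.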

william-silversmith commented 4 years ago

Perhaps that was more verbose than you needed, but yes, I think you understand the problem correctly now and this is normal behavior.