seung-lab / connected-components-3d

Connected components on discrete and continuous multilabel 3D & 2D images. Handles 26, 18, and 6 connected variants; periodic boundaries (4, 8, & 6)
GNU Lesser General Public License v3.0

Question on comparing individual lesions between two masks based on the cc3d.statistics output. #106

Open PanosProv opened 1 year ago

PanosProv commented 1 year ago

Hello,

First of all, thank you for cc3d. I am very new to the field, and I found it much easier to use than other implementations of connected components for 3D images.

I used cc3d.statistics to get the voxel counts and bounding boxes of the individual lesions in a mask. If I understand correctly, the bounding boxes represent the position of each lesion in the mask.

What I would like to do is count how many of the ground truth lesions my model correctly identified, regardless of whether their volumes match (that is, regardless of how precisely they are delineated). I guess this could be done by comparing the bounding boxes between the mask my model predicted and the ground truth mask. Is there a straightforward way to do this using cc3d or another package?

Best and thank you for your time!

william-silversmith commented 1 year ago

Hi PanosProv,

I'm very glad you are getting a lot of use out of cc3d!

I would start by comparing the centroids (also provided by cc3d.statistics), since they are easier to manage than entire bounding boxes. For each centroid in one mask, look for the nearest centroid in the other mask. This won't necessarily be a perfect metric, but it'll get you started. You can use scipy to compute the distances between two collections of points.

https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.cdist.html
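As a rough illustration of the centroid comparison, here is a minimal sketch. The function name `match_centroids` and the `max_dist` threshold are hypothetical and not part of cc3d; the right threshold depends on your data. It assumes the centroids are given as (N, 3) arrays, for example the "centroids" entry of the cc3d.statistics output with the background row (index 0) dropped.

```python
import numpy as np
from scipy.spatial.distance import cdist


def match_centroids(pred_centroids, gt_centroids, max_dist):
    """Flag lesions whose nearest centroid in the other mask is within max_dist.

    Hypothetical helper, not part of cc3d. Returns two boolean arrays:
    one entry per ground truth lesion and one entry per predicted lesion.
    """
    pred = np.asarray(pred_centroids, dtype=float).reshape(-1, 3)
    gt = np.asarray(gt_centroids, dtype=float).reshape(-1, 3)
    if len(pred) == 0 or len(gt) == 0:
        # Nothing to match against: every lesion is unmatched.
        return np.zeros(len(gt), dtype=bool), np.zeros(len(pred), dtype=bool)

    dists = cdist(gt, pred)  # Euclidean distances, shape (n_gt, n_pred)
    gt_hit = dists.min(axis=1) <= max_dist    # ground truth lesions that were found
    pred_hit = dists.min(axis=0) <= max_dist  # predictions near some real lesion
    return gt_hit, pred_hit


# Toy centroids. With real data they could come from something like
# cc3d.statistics(labels)["centroids"][1:] (assumption: row 0 is background).
gt_c = [[10, 10, 10], [40, 40, 40]]
pred_c = [[11, 10, 9], [80, 80, 80]]

gt_hit, pred_hit = match_centroids(pred_c, gt_c, max_dist=5.0)
print("detected lesions:", int(gt_hit.sum()), "of", len(gt_hit))
print("false-positive predictions:", int((~pred_hit).sum()))
```

Note that this nearest-neighbor check allows several predictions to match the same ground truth lesion. If you need a strict one-to-one matching, you would have to add an assignment step on top of the distance matrix.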

The right way to make this comparison will likely depend on your data, but you can try checking that the number, size, and centroids of the regions are approximately the same. You can also compute e.g. voxel-wise F1 scores by counting the false-positive, false-negative, and true-positive voxels for each component. There may be other appropriate metrics you can use too, such as variation of information. However, voxel-wise scores can miss useful connectivity information (something my field deals with). So many years ago, we started using Rand scores, and then even more obscure metrics.
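For the voxel-wise F1 score, a minimal sketch could look like the following. The helper `voxel_f1` is hypothetical and not part of cc3d. It scores two whole binary masks; to score a single component, you could pass masks restricted to that component. Returning 1.0 when both masks are empty is a convention chosen here, not a standard.

```python
import numpy as np


def voxel_f1(pred_mask, gt_mask):
    """Voxel-wise F1 (equivalently, Dice) between two binary masks.

    Hypothetical helper, not part of cc3d. Any nonzero voxel is foreground.
    """
    pred = np.asarray(pred_mask).astype(bool)
    gt = np.asarray(gt_mask).astype(bool)

    tp = np.count_nonzero(pred & gt)    # foreground in both masks
    fp = np.count_nonzero(pred & ~gt)   # predicted but not in ground truth
    fn = np.count_nonzero(~pred & gt)   # in ground truth but missed

    denom = 2 * tp + fp + fn
    # Convention chosen for this sketch: two empty masks agree perfectly.
    return 2 * tp / denom if denom else 1.0


# Toy example: an 8-voxel lesion, over-segmented as 12 voxels.
gt = np.zeros((4, 4, 4), dtype=np.uint8)
gt[0:2, 0:2, 0:2] = 1
pred = np.zeros((4, 4, 4), dtype=np.uint8)
pred[0:2, 0:2, 0:3] = 1

f1 = voxel_f1(pred, gt)
print(f1)  # tp=8, fp=4, fn=0, so F1 = 16 / 20 = 0.8
```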

Unfortunately, cc3d only provides basic information and the rest is kind of up to you.