janelia-flyem / gala

Automatic segmentation of electron microscopy volumes
BSD 3-Clause "New" or "Revised" License

Where and why to ignore 0 in VI? #96

Closed Keep-Passion closed 4 years ago

Keep-Passion commented 4 years ago

Thank you for sharing your code. I found that the implementations in skimage and cremi are also based on it:

skimage: https://github.com/scikit-image/scikit-image/blob/master/skimage/metrics/_variation_of_information.py
cremi: https://github.com/cremi/cremi_python/blob/master/cremi/evaluation/voi.py

I have one small question: where and why should 0 be ignored in VI?

- gala ignores 0 in both gt and pred (default).
- cremi ignores 0 only in gt, not in pred (default).
- skimage does not ignore 0 in either gt or pred (default).

I made a simple comparison below and found that the different settings give different results. What should I do for the task of neuron segmentation? Do I need to ignore the 0 label in both gt and pred?

[image]

jni commented 4 years ago

Hi @Keep-Passion and thanks for writing!

The part where skimage differs from the other two implementations is most concerning. I'll have to think about that carefully. I think it's a bug in skimage. If I don't ignore 0 in the prediction, I match [1, 1, 0] in pred to [1, 2, 3] in GT, and I get 2/3 × 1 + 1/3 × 0 = 2/3 for the VI. Would you mind raising an issue in the scikit-image repo? Thank you!
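The arithmetic above can be checked with a minimal sketch in plain Python. This is not the gala, cremi, or skimage implementation: the helper names `conditional_entropy` and `vi`, the `ignore_gt`/`ignore_pred` parameters, and the choice to drop ignored positions and renormalize over what remains are all assumptions made for illustration. Entropies are in bits.

```python
from collections import Counter
from math import log2


def conditional_entropy(x, y):
    """Return H(X | Y) in bits for two equal-length label sequences."""
    n = len(x)
    joint = Counter(zip(x, y))   # counts of each (x, y) label pair
    y_counts = Counter(y)        # counts of each y label
    return -sum(c / n * log2(c / y_counts[yv]) for (_, yv), c in joint.items())


def vi(gt, pred, ignore_gt=(), ignore_pred=()):
    """Hypothetical helper: VI = H(gt | pred) + H(pred | gt).

    Assumption: positions carrying an ignored label are dropped and the
    remaining positions are renormalized.
    """
    pairs = [(g, p) for g, p in zip(gt, pred)
             if g not in ignore_gt and p not in ignore_pred]
    gt_kept = [g for g, _ in pairs]
    pred_kept = [p for _, p in pairs]
    return (conditional_entropy(gt_kept, pred_kept)
            + conditional_entropy(pred_kept, gt_kept))


gt = [1, 2, 3]
pred = [1, 1, 0]

# Nothing ignored: pred label 1 (weight 2/3) merges two GT labels (1 bit),
# pred label 0 (weight 1/3) matches one GT label (0 bits).
print(vi(gt, pred))                    # approximately 0.667, i.e. 2/3

# Ignoring 0 in pred drops the third position, leaving only the merge.
print(vi(gt, pred, ignore_pred=(0,)))  # 1.0
```

Under these assumptions, the choice of which labels to ignore changes the score on the same pair of labelings, which matches the discrepancy reported in the question.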

As to why these should be ignored, the reasons are historical: when I coded the first implementation, we had some volumes segmented with a 1-pixel wide boundary of zeros between segments, and some without. We decided we didn't care whether the boundary-free versions landed on one side or the other, so we decided to ignore those values. In a sense, the default doesn't much matter and you should pick what makes most sense for your data. Do you care how things are segmented in the 0 label? If so, don't ignore it! You can use the 0 label to mark parts of the volume where segmentation performance is irrelevant. If it's all important, you can add 1 to all of your labels and the problem goes away!
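The two options described above (treat 0 as a "don't care" mask, or shift every label by 1 so that nothing is ignored) can be sketched as follows. The helper `vi_ignoring_zero` is hypothetical and only mimics an "ignore 0 in both gt and pred" default by dropping those positions and renormalizing; it is not gala's code.

```python
from collections import Counter
from math import log2


def vi_ignoring_zero(gt, pred):
    """Hypothetical helper: VI in bits, skipping positions where gt or pred is 0."""
    pairs = [(g, p) for g, p in zip(gt, pred) if g != 0 and p != 0]
    n = len(pairs)
    joint = Counter(pairs)
    gt_counts = Counter(g for g, _ in pairs)
    pred_counts = Counter(p for _, p in pairs)
    # H(gt | pred): penalizes merges; H(pred | gt): penalizes splits.
    h_gt_given_pred = -sum(c / n * log2(c / pred_counts[p])
                           for (_, p), c in joint.items())
    h_pred_given_gt = -sum(c / n * log2(c / gt_counts[g])
                           for (g, _), c in joint.items())
    return h_gt_given_pred + h_pred_given_gt


gt = [1, 2, 3]
pred = [1, 1, 0]

# Option 1: leave 0 as a mask. The position where pred == 0 is not scored.
masked = vi_ignoring_zero(gt, pred)

# Option 2: add 1 to every label so no 0 remains and every position is scored.
shifted = vi_ignoring_zero([g + 1 for g in gt], [p + 1 for p in pred])

print(masked)   # 1.0
print(shifted)  # approximately 0.667, i.e. 2/3
```

Because VI depends only on how positions are grouped and not on the label values themselves, the shift changes nothing except that the former 0 label now counts.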

Keep-Passion commented 4 years ago

@jni Thank you for your quick reply. I will raise an issue in skimage.

jni commented 4 years ago

Thank you! I'll close this in the meantime since I think the issue is in skimage.