Closed — lin-tianyu closed this issue 1 month ago
Possible Solution
Maybe the right way is to modify `IGNORE_PROMPT` in `infer.py` to ignore only the 5 deprecated classes, so that we have 132 classes (defined in `label_dict.json`) - 5 = 127 classes. Is that right?
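The 132 - 5 = 127 arithmetic can be sketched as a simple filter. This is a toy illustration, not the actual VISTA code: `usable_classes` and the toy dictionary are hypothetical stand-ins, and the real `label_dict.json` has 132 entries; only the 5 deprecated IDs (16, 18, 129, 130, 131) come from the thread.

```python
# Hypothetical sketch: drop the 5 deprecated class IDs from a label
# dictionary (assumed shape: {"class name": integer_id}, as in
# label_dict.json of the VISTA codebase).
DEPRECATED_IDS = {16, 18, 129, 130, 131}

def usable_classes(label_dict):
    """Return the label dict without the deprecated entries."""
    return {name: idx for name, idx in label_dict.items()
            if idx not in DEPRECATED_IDS}

# Toy 4-entry stand-in for the real 132-entry label_dict.json:
toy = {"liver": 1, "rectum": 18, "liver tumor": 130, "spleen": 3}
print(usable_classes(toy))  # "rectum" and "liver tumor" are dropped
```

Applied to the real 132-entry dictionary, this would leave the 127 usable classes discussed above.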
Hi,
Thanks for pointing this out, and sorry for the confusion. I will update the README.
Initially we included all 132 classes for training. We then found conflicts between some classes, and the training data for some classes was insufficient, so we removed part of the training sets. Those 5 classes are not supported:
- 16, "prostate or uterus": we already have a "prostate" class.
- 18, "rectum": insufficient data, or dataset excluded.
- 130, "liver tumor": we already have "hepatic tumor".
- 129, "kidney mass": insufficient data, or dataset excluded.
- 131, "vertebrae L6": insufficient data, or dataset excluded.
This cumbersome solution is just trying to mitigate the overlapping-class issue. A better solution might be to output 127 separate binary masks. I will update the README ASAP; thanks for pointing this out. Thanks, Yufan.
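The "separate binary masks" idea can be illustrated with a toy sketch (not actual VISTA code; `to_binary_masks` is a hypothetical helper): instead of one label map where overlapping classes such as "liver" and "hepatic tumor" must compete for each voxel, each requested class gets its own independent 0/1 mask.

```python
# Hypothetical sketch: one binary mask per class, so overlapping classes
# never have to share a single label map. Works on a flattened list of
# voxel labels for simplicity.
def to_binary_masks(label_map, class_ids):
    """Return {class_id: [0/1, ...]} masks, one per requested class."""
    return {c: [1 if v == c else 0 for v in label_map] for c in class_ids}

flat = [0, 5, 5, 14, 14, 0]   # toy flattened "segmentation"
masks = to_binary_masks(flat, [5, 14])
print(masks[5], masks[14])
```

In the real 127-class setting each mask would be produced independently by the model, so a voxel could belong to both a parent structure and a tumor class at once.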
Thanks for your patient response! This is really helpful for my project.
Hi @heyufan1995, thanks again for your response. However, I have another issue.
In your answer, you suggest "ii. Do 3 class "bone", "lung", "kidney" through model-zoo." It turns out that if I use `label_prompt=[2, 20, 21]`, the segmentation results will only contain their subclass counterparts (e.g., segmentation results using label prompt 2 will only have classes 5 and 14) because of the `subclass` setting. I can understand this part.
But if I try to get labels 2/20/21 in the output with label prompts 2/20/21, and simply change the `subclass` setting to `"subclass": {'2': 2, '20': 20, '21': 21}`, the results seem to collapse.
So I'm wondering whether this phenomenon is normal, or whether I am missing something here. Hoping for your response.
Thanks! Tianyu
Hi Tianyu: This phenomenon is normal. We basically disallow direct usage of "2", "20", and "21", plus the additional 5 classes that we excluded (16, "prostate or uterus", etc.). Their class embeddings are random and are not trained, or not well trained.
`{'2': 2, '20': 20, '21': 21}` means no mapping at all. We only have a few annotations of "lung", "kidney", and "bone"; most of the training data covers their substructures. Those 3 class prompts are basically not trained, so we just merge their substructures for them. For example, if you directly use the "kidney" prompt, it is just a random 1x512 class embedding with no relationship to the "left kidney" and "right kidney" prompts. That's why we have this mapping.
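The merging described above can be sketched as two small steps (a toy illustration, not the actual VISTA implementation; `expand_prompts` and `merge_to_parent` are hypothetical helpers, and the mapping `2 -> [5, 14]` follows the example IDs in this thread): expand each untrained parent prompt into its trained substructure prompts, then relabel the substructure predictions back to the parent ID.

```python
# Toy sketch of the subclass merging described above. The real mapping
# lives in the VISTA "subclass" setting; only the IDs 2 -> [5, 14] are
# taken from the example in this thread.
SUBCLASS = {2: [5, 14]}

def expand_prompts(label_prompt):
    """Replace each untrained parent class with its trained substructures."""
    out = []
    for c in label_prompt:
        out.extend(SUBCLASS.get(c, [c]))
    return out

def merge_to_parent(seg_flat, parent, children):
    """Relabel substructure predictions back to the parent class ID."""
    return [parent if v in children else v for v in seg_flat]

prompts = expand_prompts([2])                         # run inference on [5, 14]
merged = merge_to_parent([0, 5, 14, 0], 2, {5, 14})   # report them as class 2
print(prompts, merged)
```

Setting `"subclass": {'2': 2}` instead would skip the expansion step entirely and feed the untrained parent embedding straight to the model, which matches the collapsed results you observed.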
Understood! Thanks for your detailed explanation!
Describe the phenomenon
- `infer.py` of the VISTA codebase: the `EVERYTHING_PROMPT` has a total of 118 classes.
- The `label_dict.json` file of the VISTA codebase defines 132 classes.
- The `labels.json` of the MONAI wrapper of VISTA3D defines 135 classes.
Questions
… `infer.py` of the VISTA codebase?
Thank you!