wenxi-yue / SurgicalSAM

[AAAI2024] Official implementation of SurgicalSAM

Inference problems #15

Open bofang68 opened 1 month ago

bofang68 commented 1 month ago

Hi, I read your inference function. I noticed that you give the model the specific class before inference (e.g., prompting category 3 for image 1). Isn't this questionable, since it reduces the possibility of misclassification? For example, if image 1 contains only classes 3 and 7, the model will never produce a predicted mask for class 2 (even if it cannot actually distinguish class 2 from classes 3 and 7). In other words, the model is told which classes the image contains (based on the ground-truth labels), and most images contain only a few classes.

In my view, it would be fairer to predict every class for an image (e.g., categories 1 through 7) one by one. If the model performs well, it should not predict classes that the image does not contain.

Looking forward to your reply.


wenxi-yue commented 1 month ago

Hi, thanks for your interest.

We think that these represent two different settings. The setting you mentioned pertains to traditional semantic or instance segmentation, whereas in our paper, we adopt a promptable setting where the user specifies the class of interest.

Due to the different settings, we divide the methods in our comparison into specialist models (with a traditional setting) and SAM-based models (with a promptable setting). We consistently maintain a promptable setting with all SAM-based models to ensure a fair comparison, and we explicitly clarify this in the paper: "Challenge IoU measures the IoU between the predicted and ground-truth masks for only the classes present in an image, whereas IoU is computed across all classes. In our class promptable segmentation setting with class prompts provided, Challenge IoU and IoU yield identical results."
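To make the distinction concrete, here is a minimal evaluation sketch (not the repository's actual code; `model` is a hypothetical callable that takes an image and a class prompt). In the promptable setting, IoU is averaged only over the prompted classes present in the image, which is why Challenge IoU and IoU coincide; the traditional setting queries every class, so false positives on absent classes lower the IoU.

```python
import numpy as np

def binary_iou(pred, gt):
    # pred, gt: boolean masks of identical shape
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return np.nan
    return np.logical_and(pred, gt).sum() / union

def promptable_eval(model, image, gt_masks):
    # Promptable setting: only the classes actually present in the image are
    # prompted, so "Challenge IoU" and "IoU" average over the same classes.
    ious = [binary_iou(model(image, class_prompt=c), gt)  # `model` is a hypothetical callable
            for c, gt in gt_masks.items()]                # gt_masks: {class_id: bool mask}
    return np.nanmean(ious)

def traditional_eval(model, image, gt_masks, num_classes=7):
    # Traditional setting: every class is queried; predicting a class that is
    # absent from the image counts as a false positive and lowers IoU.
    ious = []
    for c in range(1, num_classes + 1):
        pred = model(image, class_prompt=c)
        gt = gt_masks.get(c, np.zeros_like(pred, dtype=bool))
        ious.append(binary_iou(pred, gt))
    return np.nanmean(ious)
```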

bofang68 commented 1 month ago

Thank you for your response. I have an additional problem with the replication process. I followed data_prepocess.py to preprocess the images and labels of the training set and obtained the same format (class_embeddings_h and sam_features_h) as the uploaded dataset. However, the results on the validation set are poor. Why is this? Should I process the validation set and the training set together?
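For anyone reproducing this preprocessing step, below is a minimal sketch of precomputing per-image SAM ViT-H features (in the spirit of sam_features_h) with the official segment_anything API; the checkpoint path and the output layout are assumptions and may differ from what data_prepocess.py actually does.

```python
import numpy as np
from PIL import Image
from segment_anything import sam_model_registry, SamPredictor

# Assumed ViT-H checkpoint path; adapt to your local setup.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

def precompute_feature(image_path):
    """Return the SAM ViT-H image embedding for one image (1 x 256 x 64 x 64)."""
    image = np.array(Image.open(image_path).convert("RGB"))
    predictor.set_image(image)
    return predictor.get_image_embedding().cpu().numpy()

# Hypothetical output layout; the repository may store features differently.
# np.save("sam_features_h/frame_000.npy", precompute_feature("images/frame_000.png"))
```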

bofang68 commented 1 month ago

Do I understand correctly that you trained on 40 subsets and found that a certain folder gave the best training results (and named it 0)?

wenxi-yue commented 1 month ago

> Thank you for your response. I have an additional problem with the replication process. I followed data_prepocess.py to preprocess the images and labels of the training set and obtained the same format (class_embeddings_h and sam_features_h) as the uploaded dataset. However, the results on the validation set are poor. Why is this? Should I process the validation set and the training set together?

For the validation set, you can use the provided validation data directly without any additional processing.

Regarding the poor performance, I may need more details to identify the problem accurately. However, I recommend double-checking that the images and annotations in the training and validation sets are consistent.
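As a concrete starting point, a quick sanity check along these lines can flag mismatched image/annotation counts or unexpected class IDs in the masks; the directory names and the 0-7 label set below are placeholders, not the repository's exact layout.

```python
import os
import numpy as np
from PIL import Image

def check_split(image_dir, mask_dir, valid_ids=set(range(8))):
    """Verify that images and masks in one split line up and use expected class IDs."""
    images = sorted(os.listdir(image_dir))
    masks = sorted(os.listdir(mask_dir))
    assert len(images) == len(masks), f"{image_dir}: image/mask counts differ"
    for mask_name in masks:
        mask = np.array(Image.open(os.path.join(mask_dir, mask_name)))
        unexpected = set(np.unique(mask).tolist()) - valid_ids
        if unexpected:
            print(f"{mask_name}: unexpected class IDs {unexpected}")

# Placeholder directory names; adapt to the actual dataset layout.
# check_split("train/images", "train/annotations")
# check_split("val/images", "val/annotations")
```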

wenxi-yue commented 1 month ago

> Do I understand correctly that you trained on 40 subsets and found that a certain folder gave the best training results (and named it 0)?

Folder 0 is not the folder with the best performance; it is the original data without any data augmentation applied. Please see this comment for a more detailed explanation of what the different folder versions represent.