I read the code for your method, and I am confused about the instance segmentation. Normally, instance segmentation provides a class label for each pixel. Does your paper produce only masks, without class labels? With SAM, when only bounding boxes are given as prompts, the output is a mask with no class information, and you pass the bounding boxes to SAM directly without any semantic information.
When computing the mIoU, do you compare only the predicted mask against a binary ground-truth mask, without any specific class information?
Thank you for your help! I am quite confused about this.
I look forward to your reply!
Yes, because SAM is a class-agnostic segmentation model. When multiple prompts are given, the order of the output masks is consistent with the order of the prompts.
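To make this concrete, here is a minimal sketch of what class-agnostic evaluation could look like. It is not taken from the repository: the helper names (`binary_iou`, `mean_iou`, `attach_labels`), the toy masks, and the choice to average IoU over mask pairs are all assumptions for illustration. It uses plain Python lists so it runs without SAM installed.

```python
from typing import List, Sequence, Tuple

Mask = List[List[int]]  # binary mask stored as rows of 0/1 values


def binary_iou(pred: Mask, gt: Mask) -> float:
    """IoU between two same-shaped binary masks; no class labels involved."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Assumption: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def mean_iou(preds: Sequence[Mask], gts: Sequence[Mask]) -> float:
    """Average the per-mask IoU over matched (prediction, ground truth) pairs."""
    if not preds or len(preds) != len(gts):
        raise ValueError("need the same non-zero number of predictions and ground truths")
    return sum(binary_iou(p, g) for p, g in zip(preds, gts)) / len(preds)


def attach_labels(masks: Sequence[Mask], box_labels: Sequence[str]) -> List[Tuple[str, Mask]]:
    """Pair the i-th mask with the label of the i-th box prompt.

    This relies on the output masks following the order of the prompts.
    """
    if len(masks) != len(box_labels):
        raise ValueError("one label per box prompt is required")
    return list(zip(box_labels, masks))


# Toy stand-ins for masks returned for two box prompts, and their ground truth.
pred_masks = [[[1, 1], [0, 0]], [[0, 1], [0, 1]]]
gt_masks = [[[1, 0], [0, 0]], [[0, 1], [0, 1]]]

print(mean_iou(pred_masks, gt_masks))  # 0.75
labeled = attach_labels(pred_masks, ["cat", "dog"])  # hypothetical labels
```

If class labels are needed downstream, they would have to come from whatever produced the boxes (a detector or the annotations), since the masks themselves carry none; matching them by index is what `attach_labels` sketches.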
Hi! Thank you for your great work!