Closed baibizhe closed 1 year ago
No problem. We used Mask-RCNN here for its simplicity in adapting pre-trained ResNet models for tumor segmentation. Feel free to let me know if you have any questions about it.
Thanks! My question is this: Mask-RCNN is an object detection and instance segmentation model, so its output is a set of bounding boxes (x, y, w, h) and their masks. How can it be applied to a semantic segmentation task such as tumor segmentation?
You are right. We treat the bounding boxes as the tumor localization and the corresponding masks as the segmentation results. In effect, instance segmentation is a stricter task than semantic segmentation, because the former must also distinguish different instances of the same object class. In our experiments, we didn't distinguish between different tumors in an image.
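To make that reduction concrete, here is a minimal sketch of how per-instance masks could be collapsed into a single tumor mask. This is only an illustration, not necessarily how the repository does it: the function name `merge_instance_masks` and both thresholds are hypothetical, and the masks are plain nested lists to keep the example dependency-free.

```python
def merge_instance_masks(masks, scores, score_thr=0.5, mask_thr=0.5):
    """Union per-instance soft masks into one binary semantic mask.

    masks:  list of H x W nested lists with per-pixel probabilities in [0, 1]
    scores: list of detection confidences, one per instance
    Both thresholds are illustrative assumptions, not values from the thread.
    """
    if not masks:
        return []
    height, width = len(masks[0]), len(masks[0][0])
    semantic = [[0] * width for _ in range(height)]
    for mask, score in zip(masks, scores):
        if score < score_thr:
            continue  # drop low-confidence detections
        for r in range(height):
            for c in range(width):
                if mask[r][c] >= mask_thr:
                    semantic[r][c] = 1  # pixel belongs to some tumor
    return semantic


# Toy example: two confident detections and one weak one on a 2 x 3 image.
masks = [
    [[0.9, 0.1, 0.0], [0.8, 0.2, 0.0]],
    [[0.0, 0.0, 0.7], [0.0, 0.0, 0.1]],
    [[0.0, 0.9, 0.0], [0.0, 0.9, 0.0]],
]
scores = [0.95, 0.80, 0.20]
tumor_mask = merge_instance_masks(masks, scores)
print(tumor_mask)
```

Because the instance identities are discarded in the union, the result can be scored with the usual semantic segmentation metrics.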
Thanks. This is interesting. Is there any specific reason a dedicated segmentation model such as U-Net isn't used here? Does it not perform well?
U-Net has a symmetric encoder-decoder architecture, which is hard to fit into our training framework. Meta-USCL is a discriminative training scheme (contrastive learning), so we only need a powerful feature extractor during pre-training. On the one hand, the 5-layer encoder of U-Net would limit pre-training performance. On the other hand, its heavy, randomly initialized decoder (about half of all model parameters) may also hurt performance on the downstream task. Mask-RCNN has a powerful encoder and a lightweight decoder, so it is a good fit for our pre-training. We want as many network parameters as possible to participate in the pre-training process.
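As a rough check on the decoder-weight argument, the sketch below counts convolution parameters in a textbook-style U-Net. Everything here is an assumption for illustration: channel widths of 64 to 1024, 3x3 double convolutions, 2x2 transposed-convolution upsampling, one input channel, one output class, and the bottleneck counted as part of the encoder. The exact decoder share depends on the U-Net variant and on how the bottleneck is attributed.

```python
def conv_params(c_in, c_out, kernel):
    """Weights plus biases of a (transposed) convolution layer."""
    return kernel * kernel * c_in * c_out + c_out


def double_conv(c_in, c_out):
    """Two stacked 3x3 convolutions, as in each U-Net stage."""
    return conv_params(c_in, c_out, 3) + conv_params(c_out, c_out, 3)


widths = [64, 128, 256, 512, 1024]  # assumed channel widths per level
in_channels, num_classes = 1, 1     # assumed grayscale input, binary output

# Encoder: five double-conv stages (the last one is the bottleneck).
encoder = 0
c_prev = in_channels
for w in widths:
    encoder += double_conv(c_prev, w)
    c_prev = w

# Decoder: upsample, concatenate the skip connection, then double conv.
decoder = 0
for w in reversed(widths[:-1]):
    decoder += conv_params(c_prev, w, 2)  # 2x2 transposed convolution
    decoder += double_conv(2 * w, w)      # input is skip + upsampled features
    c_prev = w
decoder += conv_params(c_prev, num_classes, 1)  # final 1x1 classifier

decoder_share = decoder / (encoder + decoder)
print(f"encoder: {encoder:,}  decoder: {decoder:,}  share: {decoder_share:.2f}")
```

Under contrastive pre-training only the encoder is trained, so whatever share the decoder takes starts from random initialization in the downstream task.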
Thanks! I agree, and I appreciate your explanation.
Hello. Thanks for your great work! Would you mind clearing up my confusion about using Mask-RCNN, which focuses on object detection and instance segmentation, for your segmentation task?