bethgelab / siamese-mask-rcnn

Siamese Mask R-CNN model for one-shot instance segmentation

Performance is not as good as Mask RCNN when training on a small custom dataset #30

Closed F2Wang closed 3 years ago

F2Wang commented 3 years ago

I have a small dataset of fewer than 200 images. They show different items, but I labeled them only as foreground instances and background. I fine-tuned Mask R-CNN and Siamese Mask R-CNN using the exact same config, but Mask R-CNN shows significantly better performance. The template images I supplied during testing are crops of the instances I used for training, so I don't understand what makes the performance worse on Siamese Mask R-CNN. I would appreciate any suggestions, thanks!

michaelisc commented 3 years ago

Sorry for the late reply. Siamese Mask R-CNN is a research model intended for one-shot object detection, where you really only have a single example of an object. How this can be achieved successfully is an open question, and the Siamese Mask R-CNN in this repository is only a first step. Progress is quite fast (see e.g. our most recent paper, https://arxiv.org/abs/2011.04267), but the problem is a really hard one. 200 labelled images carry significantly more information than a single example, so it is not unexpected that a fine-tuned Mask R-CNN outperforms our model, which is tailored to the one-shot scenario.