SUYEgit / Surgery-Robot-Detection-Segmentation

Object detection and segmentation for a surgery robot using Mask-RCNN on Python 3, Keras, and TensorFlow.

train network without using VIA for masks #2

Closed nabsabraham closed 6 years ago

nabsabraham commented 6 years ago

I have a dataset of images for segmentation where the region of interest is labelled 1 and the background is labelled 0 in the ground truth mask. How can I incorporate this data into the model for training? Would I have to reannotate the training images or is there a better way to use your model for my data?

SUYEgit commented 6 years ago

Hi, thanks for sharing your issue. From your description, your dataset is for binary segmentation rather than instance segmentation, right? If I understand correctly, you would need to reannotate your dataset if you really want to use Mask RCNN for your task. If you only plan to do binary classification without distinguishing different instances of the same class, I suggest trying another model such as FCN or UNet, for which you don't need to rebuild the annotations (which is painful, I know).
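For readers who already have 0/1 ground-truth masks like the ones described in the question, one possible shortcut is to split each binary mask into connected regions and treat every region as one instance. The sketch below is not part of this repository: `split_instances` is a hypothetical helper, and it assumes that separate objects never touch each other in the mask (touching objects would be merged into a single instance and would still need manual annotation).

```python
from collections import deque

import numpy as np


def split_instances(binary_mask):
    """Split a 0/1 mask into one boolean mask per 4-connected region.

    Hypothetical helper for illustration; returns an array of shape
    (height, width, num_regions).
    """
    mask = np.asarray(binary_mask).astype(bool)
    height, width = mask.shape
    visited = np.zeros_like(mask)
    instances = []
    for r in range(height):
        for c in range(width):
            if not mask[r, c] or visited[r, c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            region = np.zeros_like(mask)
            queue = deque([(r, c)])
            visited[r, c] = True
            while queue:
                y, x = queue.popleft()
                region[y, x] = True
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny, nx] and not visited[ny, nx]:
                        visited[ny, nx] = True
                        queue.append((ny, nx))
            instances.append(region)
    if not instances:
        return np.zeros((height, width, 0), dtype=bool)
    return np.stack(instances, axis=-1)


# Toy mask with two separate foreground regions.
toy = np.array([[1, 1, 0, 0],
                [0, 0, 0, 1],
                [0, 0, 0, 1]])
masks = split_instances(toy)
print(masks.shape)  # (3, 4, 2)
```

For a single-class dataset, every resulting channel would get the same class id. Whether this is good enough depends on how often objects overlap or touch in the images.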

nabsabraham commented 6 years ago

Thanks for the prompt reply! Yes, it is a binary segmentation problem, and I've tried U-Net and FCN, but the results are not promising. This paper used a variant of Mask R-CNN for a binary segmentation problem with really good results, so I was hoping to try the RCNN approach. Would 'rebuilding annotations' involve creating the object mask AND indicating a bounding box region, or just the mask annotation? I'm a little confused about the inputs to the Mask RCNN model.

SUYEgit commented 6 years ago

Hi, this model doesn't require you to annotate the bounding boxes. Bounding boxes are generated automatically. You only need to annotate the mask of each instance as well as the class of each instance. For example, if you have an image of size (1600,1200) with 5 annotated instances, the load_mask function in surgery.py will generate a mask array of size (1600,1200,5), where each channel is the boolean mask of one instance, and a list of class ids such as [1,1,3,5,2]. Sorry for the late response, I just saw your question. Hope this helps. If you have any other difficulty implementing the code, please feel free to ask me.
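The format described above can be sketched with NumPy. This is a toy illustration under stated assumptions, not the code from surgery.py: the image size, the rectangular instance shapes, and the helper name `boxes_from_masks` are all made up for the example. It shows a per-instance boolean mask array with a matching list of class ids, and why no box annotation is needed: a box can be taken as the smallest rectangle that encloses each mask.

```python
import numpy as np

# Toy stand-in for the (height, width, num_instances) boolean mask array.
height, width, num_instances = 8, 10, 3
masks = np.zeros((height, width, num_instances), dtype=bool)
masks[1:3, 1:4, 0] = True  # instance 0
masks[4:7, 2:5, 1] = True  # instance 1
masks[2:6, 6:9, 2] = True  # instance 2

# One class id per mask channel, in the same order as the channels.
class_ids = np.array([1, 1, 3], dtype=np.int32)


def boxes_from_masks(masks):
    """Return one (y1, x1, y2, x2) box per instance; y2 and x2 are exclusive.

    Hypothetical helper for illustration only.
    """
    boxes = np.zeros((masks.shape[-1], 4), dtype=np.int32)
    for i in range(masks.shape[-1]):
        ys, xs = np.where(masks[:, :, i])
        if ys.size:  # an empty mask keeps an all-zero box
            boxes[i] = [ys.min(), xs.min(), ys.max() + 1, xs.max() + 1]
    return boxes


boxes = boxes_from_masks(masks)
print(masks.shape, class_ids.shape, boxes.shape)
```

The only requirement visible from the description is that the number of mask channels equals the number of class ids, so each channel can be paired with its class.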