ghost opened this issue 6 years ago
I would also be interested in this capability.
@dharma-kc, @timbrucks
In this case, I would argue that there is no benefit in using Mask R-CNN in the first place. It would be more straightforward to start with a Faster R-CNN implementation. Then, after fine-tuning the Faster R-CNN network, incorporate ROI Align in place of ROI Pooling if you want slightly better performance (see Table 3 in the Mask R-CNN paper).
If you want to use Mask R-CNN, then you have to turn off layers and losses relating to the masks. For example, build_fpn_mask_graph() and mrcnn_mask_loss_graph(). You also have to change the way the image ground-truths are loaded, especially the part where the ground-truth bounding boxes are calculated from the ground-truth masks (as opposed to just using the provided ground-truth bounding boxes).
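To make the second point concrete: in this repo the ground-truth boxes are computed from the ground-truth masks by utils.extract_bboxes() inside load_image_gt() in model.py, and that is the call you would bypass if you load annotated boxes directly. A tiny standalone sketch of what that helper does (toy numbers, assuming the mrcnn package from this repo is importable):

```python
import numpy as np
from mrcnn import utils

# Toy example: one small instance mask -> the (y1, x1, y2, x2) box derived from it.
mask = np.zeros((8, 8, 1), dtype=bool)   # [height, width, num_instances]
mask[2:4, 3:6, 0] = True                 # a 2x3 blob for instance 0
print(utils.extract_bboxes(mask))        # [[2 3 4 6]]
```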
Thanks for the feedback @FruVirus. I will give that a shot. I will say that Mask R-CNN does give excellent results out of the box!
@timbrucks , indeed it does! What I also like about Mask R-CNN is that the implementation is very modular and it takes you through some of the foundational steps in object detection and instance segmentation.
@timbrucks Thank you for your suggestion. I would like to add ROI Align to Faster R-CNN and remove the ROI Pooling layer. If you have experience with that, could you tell me which places I should change to get that capability? Otherwise I will try it on my own. Thank you.
@FruVirus I have turned off the layers related to segmentation, as well as changed parts of the input pipeline, including reading the ground-truth rectangles directly instead of calculating them from masks. Then I trained the model, and I found that rpn_bbox_loss becomes NaN after several epochs of iteration. How can I fix this? Thanks!
I am working on Mask R-CNN and am training on my own images. I have 1 + 1 classes. In my JSON I have two shape types: 'rect' and 'polygon'. My code stops working when I add 'rect' annotations to the same dataset; it only works with polygon shapes. Please let me know how to handle the different shapes.
@daoud
Well, it might be too late, but anyway, here is what you can do.
Assuming you are loading masks as written in the repo and the dataset is annotated using VIA, you can add the following lines to the load_mask() function, right after for i, p in enumerate(info["polygons"]):

if p['name'] == 'rect':
    p['all_points_y'], p['all_points_x'] = [p['y'], p['y'] + p['height'], p['y'], p['y'] + p['height']], [p['x'], p['x'] + p['width'], p['x'] + p['width'], p['x']]
@amankhandelia, you might have reversed two entries in all_points_y. It should be:

all_points_x = [p['x'], p['x'] + p['width'], p['x'] + p['width'], p['x']]
all_points_y = [p['y'], p['y'], p['y'] + p['height'], p['y'] + p['height']]

instead of your statement:

all_points_x = [p['x'], p['x'] + p['width'], p['x'] + p['width'], p['x']]
all_points_y = [p['y'], p['y'] + p['height'], p['y'], p['y'] + p['height']]
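A quick way to see why the order matters: with the corrected order the four points trace the rectangle's outline, while the reversed order traces a self-intersecting "bowtie", so skimage.draw.polygon fills a much smaller region. A small standalone check (toy numbers, not from any dataset):

```python
import skimage.draw

x, y, w, h = 10, 20, 40, 30   # made-up rectangle: top-left corner plus width/height

# Correct order: corners visited around the rectangle
rr, cc = skimage.draw.polygon([y, y, y + h, y + h], [x, x + w, x + w, x])
print(len(rr))   # roughly w * h pixels filled

# Reversed y entries: the vertices form a self-intersecting bowtie
rr, cc = skimage.draw.polygon([y, y + h, y, y + h], [x, x + w, x + w, x])
print(len(rr))   # noticeably fewer pixels filled
```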
@FruVirus @timbrucks Hello. Can you explain how I can train Mask R-CNN for the object detection and masking task? Let me explain: I want a rectangular bounding box around the objects that I have to detect, plus a mask for each of them. I don't need class labels or confidence scores.
Any suggestion would be highly appreciated. Thank you.
I have a thought that may be easy to apply: for box-only detection, fake a mask from the box area, then train the network with no code changes, but set mrcnn_mask_loss in LOSS_WEIGHTS in the config to 0.
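A minimal sketch of both pieces, assuming the Matterport config/dataset API (BoxOnlyConfig and box_to_mask are names made up for illustration, not part of the repo):

```python
import numpy as np
from mrcnn.config import Config

class BoxOnlyConfig(Config):
    NAME = "box_only"          # hypothetical experiment name
    NUM_CLASSES = 1 + 1        # background + one class
    LOSS_WEIGHTS = {
        "rpn_class_loss": 1.,
        "rpn_bbox_loss": 1.,
        "mrcnn_class_loss": 1.,
        "mrcnn_bbox_loss": 1.,
        "mrcnn_mask_loss": 0.,  # mask head still runs, but its loss is ignored
    }

def box_to_mask(x, y, w, h, height, width):
    """Fake an instance mask by filling the annotated box, so that
    utils.extract_bboxes() recovers exactly the rectangle you labelled."""
    m = np.zeros((height, width), dtype=bool)
    m[y:y + h, x:x + w] = True
    return m
```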
@FruVirus Is it possible to do it the other way around, i.e. use the ground-truth bounding boxes to do instance segmentation, i.e. create the masks?
Here is a version of the shape loop in load_mask() that handles 'polygon', 'rect', and 'circle' annotations together:

for i, p in enumerate(info["polygons"]):
    # Get indexes of pixels inside the shape and set them to 1
    if p['name'] == 'polygon':
        rr, cc = skimage.draw.polygon(p['all_points_y'], p['all_points_x'])
    elif p['name'] == 'rect':
        p['all_points_y'] = [p['y'], p['y'], p['y'] + p['height'], p['y'] + p['height']]
        p['all_points_x'] = [p['x'], p['x'] + p['width'], p['x'] + p['width'], p['x']]
        rr, cc = skimage.draw.polygon(p['all_points_y'], p['all_points_x'])
    elif p['name'] == 'circle':
        # skimage.draw.circle takes (row, col, radius), i.e. the y coordinate first
        rr, cc = skimage.draw.circle(p['cy'], p['cx'], p['r'])
    # Clip coordinates that fall outside the image
    rr[rr > mask.shape[0] - 1] = mask.shape[0] - 1
    cc[cc > mask.shape[1] - 1] = mask.shape[1] - 1
    mask[rr, cc, i] = 1
return mask, np.ones([mask.shape[-1]], dtype=np.int32)
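One caveat about the snippet above: skimage.draw.circle was deprecated and then removed in newer scikit-image releases (0.19+). On recent versions the equivalent call is skimage.draw.disk, which takes the centre as a (row, col) tuple:

```python
# scikit-image >= 0.19: draw.circle() is gone; draw.disk() is the replacement
rr, cc = skimage.draw.disk((p['cy'], p['cx']), p['r'], shape=mask.shape[:2])
```

Passing shape also clips the returned coordinates to the image, so the manual clipping lines above are not needed for circles on those versions.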
Hello Everyone,
Can you explain how I can train this detectron for the object detection task only? By object detection only, I mean I have a rectangular bounding box around the objects I need to detect. I don't have masks for the objects, and I don't want masks in my inference either.
All I want is to train it like object detection frameworks such as YOLO and SSD, with just a rectangular bounding box around each object.
Any suggestions will be highly appreciated. Thank you. Regards, Dharma KC