Poor mask prediction on image's boundary

enchantress1016 commented 6 years ago

Hi, I tried different configs with the network and noticed that the mask prediction often loses the mask at the image boundary. My masks are usually rectangular, spanning several rows from the far left to the far right of the image, but the predicted masks sometimes miss the leftmost or rightmost part by several pixels. Any idea why this happens and how I could change the config to improve it? Thanks!
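To put a number on the "several pixels" that go missing, a small diagnostic like the one below could be run on each predicted mask. This is only a sketch: `edge_gaps` is a hypothetical helper (not part of Mask_RCNN), and it assumes the mask is a 2D grid of 0/1 values given as a list of rows.

```python
def edge_gaps(mask):
    """Return (left_gap, right_gap): the number of empty columns
    between each image edge and the nearest column that contains
    at least one mask pixel.

    `mask` is assumed to be a list of rows holding 0/1 values.
    """
    width = len(mask[0])
    # Columns in which any row has a mask pixel.
    cols = [x for x in range(width) if any(row[x] for row in mask)]
    if not cols:
        # Empty mask: every column is missing on both sides.
        return width, width
    return cols[0], width - 1 - cols[-1]


# Toy 3x8 mask whose middle row covers columns 2..6 only.
toy_mask = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
left_gap, right_gap = edge_gaps(toy_mask)
print(left_gap, right_gap)
```

Comparing these gaps between the ground-truth and predicted masks would show whether the loss is systematic on one side or varies per image.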

FYI, all my training and testing data have the same size (1024, 1024, 3), so I think I probably don't need to pad anymore. My config is as follows:

Configurations:
BACKBONE                       resnet101
BACKBONE_STRIDES               [4, 8, 16, 32, 64]
BATCH_SIZE                     1
BBOX_STD_DEV                   [0.1 0.1 0.2 0.2]
DETECTION_MAX_INSTANCES        100
DETECTION_MIN_CONFIDENCE       0.98
DETECTION_NMS_THRESHOLD        0.3
GPU_COUNT                      1
GRADIENT_CLIP_NORM             5.0
IMAGES_PER_GPU                 1
IMAGE_MAX_DIM                  1024
IMAGE_META_SIZE                14
IMAGE_MIN_DIM                  1024
IMAGE_MIN_SCALE                0
IMAGE_RESIZE_MODE              none
IMAGE_SHAPE                    [1024 1024 3]
LEARNING_MOMENTUM              0.9
LEARNING_RATE                  0.001
LOSS_WEIGHTS                   {'rpn_class_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_mask_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_bbox_loss': 1.0}
MASK_POOL_SIZE                 70
MASK_SHAPE                     [56, 56]
MAX_GT_INSTANCES               100
MEAN_PIXEL                     [123.7 116.8 103.9]
MINI_MASK_SHAPE                (56, 56)
NAME                           Mtc
NUM_CLASSES                    2
POOL_SIZE                      7
POST_NMS_ROIS_INFERENCE        1000
POST_NMS_ROIS_TRAINING         2000
ROI_POSITIVE_RATIO             0.33
RPN_ANCHOR_RATIOS              [0.5, 1, 2]
RPN_ANCHOR_SCALES              (32, 64, 128, 256, 512)
RPN_ANCHOR_STRIDE              1
RPN_BBOX_STD_DEV               [0.1 0.1 0.2 0.2]
RPN_NMS_THRESHOLD              0.7
RPN_TRAIN_ANCHORS_PER_IMAGE    512
STEPS_PER_EPOCH                100
TRAIN_BN                       False
TRAIN_ROIS_PER_IMAGE           512
USE_MINI_MASK                  False
USE_RPN_ROIS                   True
VALIDATION_STEPS               50
WEIGHT_DECAY                   0.0001

fastlater commented 6 years ago

@enchantress1016 May I ask you what is the effect of MASK_POOL_SIZE = 70 and MASK_SHAPE = [56, 56]? The original script values are 14 and [28,28] respectively. Do they increase the accuracy in 1024*1024 images?

About your question, one suggestion: have you tried inspecting your trained model with https://github.com/matterport/Mask_RCNN/blob/master/samples/nucleus/inspect_nucleus_model.ipynb

By the way, don't you think your DETECTION_MIN_CONFIDENCE is too high? This means that the model has to be 98% sure that x pixel belongs to y class, so it may be causing the error at the border. It is just my opinion.
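For reference, here is a minimal sketch of what a confidence threshold does to a set of detections. It rests on an assumption worth checking in the repository's config.py: as far as I know, the threshold is compared against each detected instance's class score rather than against individual pixels. The detection records and the `filter_by_confidence` helper are hypothetical and not part of Mask_RCNN.

```python
def filter_by_confidence(detections, min_confidence):
    """Keep only detections whose score is at least min_confidence."""
    return [d for d in detections if d["score"] >= min_confidence]


# Hypothetical detections with made-up scores, for illustration only.
detections = [
    {"label": "object", "score": 0.99},
    {"label": "object", "score": 0.95},
    {"label": "object", "score": 0.72},
]

strict = filter_by_confidence(detections, 0.98)   # threshold from the thread
relaxed = filter_by_confidence(detections, 0.70)  # an arbitrary lower value
print(len(strict), len(relaxed))
```

If that assumption holds, lowering the value would mainly bring back whole instances that were dropped, so it is worth testing separately from changes to the mask head.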

enchantress1016 commented 6 years ago

@fastlater Hi, thank you for your suggestion! I'll read through the link you posted, hope it can inspire me :)

Regarding mask_shape [56, 56] and mask_pool_size 70, I adjusted them to increase the resolution of my masks. My masks have small variations along the edge that the original config could not detect. After making these changes, you can see a clear resolution increase in the detected masks, although I need it to be even better ;)

fastlater commented 6 years ago

@enchantress1016
Getting good results is easy with this state-of-the-art script. However, reaching really high accuracy needs some extra work, maybe some preprocessing or postprocessing. I read this post from Kaggle's DSB forum https://www.kaggle.com/c/data-science-bowl-2018/discussion/54741 and it helped me a lot. Actually, this competition's forum is full of discussions with good training tips. I haven't reached the desired accuracy either, but I will try changing the mask shape and mask pool size and see what happens, since my input images are the same size as yours and the rest of my parameters are almost the same as yours. Did you use ImageNet or COCO weights, or train from scratch? For me, COCO is slightly better than ImageNet, but testing is the only way to know which one fits your data better.

enchantress1016 commented 6 years ago

@fastlater I used COCO as my starting point. For my dataset, ImageNet performs much worse than COCO, but I haven't figured out why. Remember that when you change the mask shape, you also need to change the architecture by adding an additional deconv layer in model.py. I did see a clear performance improvement with a higher mask_pool_size value at test time. However, when training the network, the largest mask_pool_size I could use was 28 (anything higher caused an OOM error). So I trained with mask_pool_size 28 and predicted with mask_pool_size 70. Maybe you can try increasing mask_pool_size in training if your hardware allows.
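The link between pool size, deconv layers, and mask shape can be sketched with a bit of arithmetic. This assumes each deconv layer in the mask head is a stride-2 transposed convolution that doubles the spatial size, which matches the stock values quoted earlier in the thread (14 and [28, 28]); `mask_head_output_size` is a hypothetical helper, not a function from model.py.

```python
def mask_head_output_size(pool_size, num_deconv_layers):
    """Spatial size after `num_deconv_layers` stride-2 transposed
    convolutions, assuming each one exactly doubles height and width."""
    return pool_size * (2 ** num_deconv_layers)


# Stock settings mentioned in the thread: pool size 14, one deconv layer.
stock = mask_head_output_size(14, 1)
# Same pool size with one additional deconv layer.
extra_layer = mask_head_output_size(14, 2)
print(stock, extra_layer)
```

Under this assumption, the configured MASK_SHAPE has to equal the size that comes out of the mask head during training, which is why changing the mask shape also means changing the architecture.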

fastlater commented 6 years ago

I see. About the weights: in some cases ImageNet is better and in others COCO is better, so there is nothing strange about that. You can try a different optimizer, Adam for example; some research papers use Adam as the optimizer. But, just like the starting weights, it depends on the case.

enchantress1016 commented 6 years ago

@fastlater That's a really good point. I thought the code used Adam by default, so I never thought about changing the optimizer. Also, I tried a weighted Dice+BCE loss function, but didn't see a clear improvement on my data. Maybe you can try it on yours, as some people did get better boundary detection with it.
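For anyone unfamiliar with the idea, a weighted Dice+BCE loss can be sketched in plain Python as below. This is a generic illustration and not the code I actually ran: the weights, the smoothing constant, and the function names are all assumptions, and a real version would be written with Keras/TensorFlow tensor ops inside the mask loss in model.py.

```python
import math


def bce_loss(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over flattened mask pixels."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


def dice_loss(y_true, y_pred, smooth=1.0):
    """1 - soft Dice coefficient; `smooth` avoids division by zero."""
    intersection = sum(t * p for t, p in zip(y_true, y_pred))
    denom = sum(y_true) + sum(y_pred)
    return 1.0 - (2.0 * intersection + smooth) / (denom + smooth)


def dice_bce_loss(y_true, y_pred, dice_weight=0.5, bce_weight=0.5):
    """Weighted sum of Dice loss and BCE (the weights are hypothetical)."""
    return (dice_weight * dice_loss(y_true, y_pred)
            + bce_weight * bce_loss(y_true, y_pred))


# Flattened toy mask and two candidate predictions.
target = [1, 0, 1, 0]
close_pred = [0.9, 0.1, 0.8, 0.2]
wrong_pred = [0.0, 1.0, 0.0, 1.0]
print(dice_bce_loss(target, close_pred) < dice_bce_loss(target, wrong_pred))
```

The Dice term scores overlap relative to the mask's size, so thin strips near the border are not drowned out by the background pixels, while the BCE term keeps a per-pixel gradient.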

florian-koenig commented 5 years ago

@enchantress1016 Can you please point me to a description of how to use Dice+BCE? I'd like to see whether it produces better boundaries in my use case as well. Thanks!

K-M-Ibrahim-Khalilullah commented 5 years ago

@waleedka How can I improve the predicted mask? My detected bounding boxes are good, but the masks are not. What are the ways to improve the mask?

Thanks

banafsh89 commented 5 years ago

@ibrahimLearning Were you able to improve your predicted mask? I have the same problem.

matterport / Mask_RCNN

Poor mask prediction on image's boundary #1027