Performance in correct Mask R-CNN implementation

wondervictor commented 5 years ago

Hi, thanks for your nice work; I appreciate it very much. I have some questions about your implementation. It is based on Mask_RCNN, but I found that this implementation seems to have many problems, which lead to much lower performance than the official implementations (Detectron, maskrcnn-benchmark).
I'm interested in your work and tried to implement your proposed method in maskrcnn-benchmark. The only difference between my implementation and yours is that I use abs() instead of sqrt() to aggregate the edges detected in the X- and Y-directions, because sqrt causes numerical problems. In theory, sqrt and abs are nearly the same. I obtained the results below, and my implementation is consistent with the official implementation.

model	loss	box AP	mask AP
MaskR-CNN-ResNet50(baseline)	-	36.5	33.2
MaskR-CNN-ResNet50(EdgeAgreementLoss)	L1 Loss	36.8	33.6
MaskR-CNN-ResNet50(EdgeAgreementLoss)	L2 Loss	35.6	33.6

I wonder why I can't reproduce the performance gain reported in your paper. Can you provide results obtained with a more accurate implementation?
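
For readers following the abs() vs. sqrt() point, here is a minimal sketch of the two aggregations. It assumes 3x3 Sobel filters and reads "abs" as |gx| + |gy| and "sqrt" as sqrt(gx^2 + gy^2); the post does not spell out either, so the kernels, the toy mask, and all function names are illustrative only. The last helper shows one plausible source of the numerical problems: the derivative of the sqrt form is 0/0 wherever both filter responses are zero, which is the case in every flat region of a mask.

```python
import math

# 3x3 Sobel kernels (assumed; the thread does not name the edge filter).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def filter2d(image, kernel):
    """Valid-mode cross-correlation of a 2D list with a 3x3 kernel."""
    h, w = len(image), len(image[0])
    return [
        [
            sum(kernel[a][b] * image[i + a][j + b]
                for a in range(3) for b in range(3))
            for j in range(w - 2)
        ]
        for i in range(h - 2)
    ]


def magnitude_sqrt(gx, gy):
    """Euclidean aggregation: sqrt(gx^2 + gy^2) per pixel."""
    return [[math.sqrt(x * x + y * y) for x, y in zip(rx, ry)]
            for rx, ry in zip(gx, gy)]


def magnitude_abs(gx, gy):
    """Absolute-value aggregation: |gx| + |gy| per pixel."""
    return [[abs(x) + abs(y) for x, y in zip(rx, ry)]
            for rx, ry in zip(gx, gy)]


def sqrt_grad_wrt_gx(x, y):
    """d/dx sqrt(x^2 + y^2) = x / sqrt(x^2 + y^2), undefined at x = y = 0."""
    norm = math.sqrt(x * x + y * y)
    return x / norm if norm > 0 else float("nan")


# Toy mask with a flat region on the left and a vertical edge on the right.
mask = [[0, 0, 0, 1, 1] for _ in range(4)]
gx = filter2d(mask, SOBEL_X)
gy = filter2d(mask, SOBEL_Y)

print(magnitude_sqrt(gx, gy))
print(magnitude_abs(gx, gy))
print(sqrt_grad_wrt_gx(0, 0))
```

On an axis-aligned edge the two aggregations agree; on a diagonal edge they differ by a factor of up to sqrt(2), so "nearly the same" holds only up to that factor.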

wondervictor commented 5 years ago

@FlashTek Hi, waiting for your response.

zimmerrol commented 5 years ago

Hi @wondervictor, we are aware that the M-RCNN implementation by matterport does not yield the same (SOTA) results as other implementations. Nevertheless, we did not consider this to be a problem, as there are papers that have already used the idea of the Edge Detection Head to improve their instance segmentation accuracy in different domains and using different models (e.g. here). Therefore, we are fairly sure that this is a general property of instance segmentation networks and is not just restricted to the implementation done by matterport.

Furthermore, we are not sure what you mean by the sqrt/abs problem. Do you mean that you combined the contributions in the x- and y-directions and only used the magnitude? If so, please take a look at the section of our paper that describes failed experiments. It contains a statement that says essentially the same thing you do: using only the magnitude does not bring a strong improvement over the baseline.
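
To illustrate why the magnitude alone can be a weaker signal, here is a minimal sketch that compares a magnitude-only loss with a loss on the separate x- and y-responses. The loss definitions, the exponent, and the example values are assumptions made for illustration and are not taken from the paper or the repository.

```python
import math


def magnitude(gx, gy):
    """Edge magnitude of a single pixel from its x- and y-responses."""
    return math.sqrt(gx * gx + gy * gy)


def magnitude_only_loss(pred, target, p=2):
    """Compares magnitudes only, so edge direction and sign are ignored."""
    return abs(magnitude(*pred) - magnitude(*target)) ** p


def per_direction_loss(pred, target, p=2):
    """Compares the x- and y-responses separately and sums the penalties."""
    return sum(abs(a - b) ** p for a, b in zip(pred, target))


# (gx, gy) responses at one pixel: a vertical edge vs. a horizontal edge.
vertical_edge = (4.0, 0.0)
horizontal_edge = (0.0, 4.0)

print(magnitude_only_loss(vertical_edge, horizontal_edge))
print(per_direction_loss(vertical_edge, horizontal_edge))
```

In this toy case the two edges have identical magnitude, so the magnitude-only loss cannot tell them apart, while the per-direction loss penalizes the mismatch.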

Otherwise, please also note that we have shown in the paper that the performance gain is very sensitive to the exponent p used in the L-p norm. Also, please be aware that we did not use the L-p norm itself but the L-p norm raised to the power of p (cf. eq. 5). We are sorry if this has caused any confusion on your side.
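
The distinction between the L-p norm and the L-p norm raised to the power of p can be sketched as follows. This is only an illustration on a flat list of differences; whether eq. 5 sums or averages over pixels is not stated in this thread, so the plain sum and the function names here are assumptions.

```python
def lp_norm(diffs, p):
    """Standard L-p norm: (sum_i |d_i|^p)^(1/p)."""
    return sum(abs(d) ** p for d in diffs) ** (1.0 / p)


def lp_norm_pow_p(diffs, p):
    """L-p norm raised to the power of p: sum_i |d_i|^p (no outer root)."""
    return sum(abs(d) ** p for d in diffs)


# Differences between predicted and target edge responses (toy values).
diffs = [3.0, -4.0]

for p in (1, 2):
    print(p, lp_norm(diffs, p), lp_norm_pow_p(diffs, p))
```

For p = 1 the two coincide; for p = 2 the second form is the sum of squared errors rather than the Euclidean norm, which changes both the scale of the loss and its gradients.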

Lastly, please send us a complete overview of your configuration (of the base M-RCNN and the newly added Edge Agreement Head), a figure containing the loss curves (edge agreement loss, mask loss), and some samples of the predicted masks with and without the Edge Agreement Head, so that we can better understand your problem.

wondervictor commented 5 years ago

Thanks for your reply. Your explanation gave me a new understanding of your method. I had overlooked the detail that the magnitude is not used in your method. I've rechecked your code and now understand more of the details. I'll fix the bugs in my implementation and continue with some experiments. Thanks!

zimmerrol / mask-rcnn-edge-agreement-loss
