Concerning the overfitting problem in PCNet-M and PCNet-C

XiaohangZhan / deocclusion

Code for our CVPR 2020 work.

Apache License 2.0


Concerning the overfitting problem in PCNet-M and PCNet-C #14

Closed: sydney0zq closed this issue 4 years ago

sydney0zq commented 4 years ago

Hi Xiaohang,

Thanks for releasing the code; the demo is really amazing. Recently I tested a few images from the COCO validation set with your pre-trained models, and I also used the same demo images you used. However, the results are frustrating and far from satisfactory. Could you please check them?

This is COCOA/2.jpg with the COCOA/2.json ground-truth annotation.

This is COCOA/2.jpg with the CenterMask instance segmentation results.

The predicted segmentation differs only slightly from the ground truth but, as we can see, the instance is not completed well.

sydney0zq commented 4 years ago

CenterMask prediction: https://1drv.ms/u/s!Am-RqVBo6TOQhp1m_x1xRUJ5bLrfQg?e=lfPtdk

XiaohangZhan commented 4 years ago

Hi, the reason is that when you use predicted masks, there are slight margins between the target object to complete and its occluders. Recall that in training we use surrogate objects as occluders, so the trimmed modal mask and the surrogate occluder fit each other tightly. This gap between training and inference causes what you see. It is easy to fix: adjust dilate_kernel in both ordering recovery and amodal completion (infer.infer_order and infer.infer_amodal, respectively) to 3, 5, 7, or larger. This expands the occluder masks to fill in the margins. You may also want to adjust th in these two functions to control the response level.
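To make the effect of the dilation concrete, here is a minimal NumPy sketch of how expanding an occluder mask closes a one-pixel margin next to the target. The helper dilate_mask and the toy masks are hypothetical and written only for illustration; they are not the repository's code, and the way infer.infer_order and infer.infer_amodal apply dilate_kernel internally may differ. The sketch assumes a square kernel with an odd size.

```python
import numpy as np

def dilate_mask(mask: np.ndarray, kernel_size: int) -> np.ndarray:
    """Binary dilation with a square kernel (hypothetical helper, odd kernel_size assumed)."""
    mask = mask.astype(bool)
    if kernel_size <= 1:
        return mask  # a kernel of size 1 leaves the mask unchanged
    pad = kernel_size // 2
    padded = np.pad(mask, pad, mode="constant", constant_values=False)
    h, w = mask.shape
    out = np.zeros((h, w), dtype=bool)
    # OR together every shifted copy of the mask that falls inside the kernel window.
    for dy in range(kernel_size):
        for dx in range(kernel_size):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

# Toy scene: the target occupies columns 0-3 and the predicted occluder columns 5-7,
# leaving column 4 as a one-pixel margin that belongs to neither mask.
target = np.zeros((6, 10), dtype=bool)
target[1:5, 0:4] = True
occluder = np.zeros((6, 10), dtype=bool)
occluder[1:5, 5:8] = True

margin_before = int((~(target | occluder))[1:5, 4].sum())

# A 3x3 kernel grows the occluder by one pixel in every direction.
occluder_dilated = dilate_mask(occluder, 3)
margin_after = int((~(target | occluder_dilated))[1:5, 4].sum())

print("uncovered margin pixels before:", margin_before)
print("uncovered margin pixels after: ", margin_after)
```

In this toy case a 3x3 kernel is enough because the margin is one pixel wide; wider margins from a real segmentation model would need a larger kernel, which matches the suggestion to try 5, 7, or more.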

sydney0zq commented 4 years ago

Hi Xiaohang, thanks for your quick reply. I have changed dilate_kernel to 7 in both infer.infer_order (where order_matrix is obtained) and infer.infer_amodal (where the amodal prediction is obtained), and rerun the program. Could you please check whether the results are correct?

02_COCO_val2014_000000052425 03_COCO_val2014_000000063650 05_COCO_val2014_000000542960

Thanks again.

XiaohangZhan commented 4 years ago

Yes, they are almost correct. In the first case, the modal mask is not fine-grained enough: the upper-left part is missing from it and is also not included in the occluder, so the upper-left corner of the car cannot be completed. I guess it would improve with a more accurate modal mask; you can try dense CRF to refine the modal mask. In the second case, the person in front of the car is not detected and therefore not included as an occluder, so the part of the car occluded by this person cannot be completed. The third one is almost what we expect.

sydney0zq commented 4 years ago

Thanks for your detailed reply. I will explore this interesting task further. :)