Inferior performance on PASCAL VOC12 with DeepLabV3+

WXinlong / DenseCL

Dense Contrastive Learning (DenseCL) for self-supervised representation learning, CVPR 2021 Oral.

GNU General Public License v3.0


Inferior performance on PASCAL VOC12 with DeepLabV3+ #3

Closed syorami closed 3 years ago

syorami commented 3 years ago

Thanks for releasing your code; the results are impressive.

I downloaded the DenseCL pretrained models and tested them on the VOC semantic segmentation dataset. With the same FCN architecture, the results match expectations: the DenseCL ImageNet pretrained model outperforms the ImageNet classification model. However, when I use them as the backbone of DeepLabV3+, the DenseCL models perform worse. The comparison is below:

Arch	Dataset	Pretrained Model	mIoU
dv3+	VOC12	Sup ImageNet	71.33
dv3+	VOC12	DenseCL COCO	67.51
dv3+	VOC12	DenseCL ImageNet	69.5

The configs are borrowed from the official MMSEG configs, and I tried to make as few modifications as possible. Have you noticed the same behavior on any other models or datasets?
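To make the size of the gap explicit, the mIoU differences against the supervised baseline can be computed directly from the table above. This is a minimal sketch; the dictionary and variable names are illustrative and not part of any config.

```python
# mIoU values copied from the table above (DeepLabV3+ on VOC12).
results = {
    "Sup ImageNet": 71.33,
    "DenseCL COCO": 67.51,
    "DenseCL ImageNet": 69.5,
}

baseline_name = "Sup ImageNet"
baseline = results[baseline_name]

# Difference of each DenseCL model relative to the supervised baseline.
gaps = {
    name: round(miou - baseline, 2)
    for name, miou in results.items()
    if name != baseline_name
}

for name, gap in gaps.items():
    print(f"{name}: {gap:+.2f} mIoU vs {baseline_name}")
```

Both DenseCL backbones fall below the supervised one here, by 3.82 and 1.83 mIoU respectively.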

WXinlong commented 3 years ago

@syorami What's your config? There is probably a mismatch in it. Please make sure you use the same ResNet architecture as in our provided config.

syorami commented 3 years ago

Yes. I noticed the difference between the backbones in your implementation and MMSEG's, so I merged ResNetNormal into the registry. There are no errors when loading the pretrained model. My complete config is attached: config.zip

The dataset and schedule configs are the same as those provided by MMSEG except for the number of training iterations, as I found that 3k iterations were already enough to train a VOC12 baseline. The experiments were run on 8 V100 GPUs and took around 1.5h. I previously trained MMSEG's dv3+ without modifications on PASCAL VOC12 and got around 72.5 mIoU, so I think 71.33 mIoU is a fairly close result here considering the backbone difference.

I would appreciate it if you could help check the config or give some advice.
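A load that raises no error does not by itself show that every backbone weight was applied, since non-strict loading can skip parameter names that do not match. One way to check is to compare the key names on both sides. The sketch below is plain Python and assumes you have already collected the names, for example from `model.state_dict().keys()` and from the checkpoint's state dict. The helper `compare_keys`, the `backbone.` prefix, and the sample key names are hypothetical, for illustration only.

```python
def compare_keys(model_keys, checkpoint_keys, strip_prefix=""):
    """Report which parameter names fail to line up between a model and a checkpoint."""
    normalized = set()
    for key in checkpoint_keys:
        # Optionally strip a wrapper prefix from checkpoint names before comparing.
        if strip_prefix and key.startswith(strip_prefix):
            key = key[len(strip_prefix):]
        normalized.add(key)

    model_keys = set(model_keys)
    return {
        "missing_in_checkpoint": sorted(model_keys - normalized),
        "unexpected_in_checkpoint": sorted(normalized - model_keys),
        "matched": len(model_keys & normalized),
    }


# Hypothetical key names, not taken from the actual DenseCL checkpoint.
model_keys = ["conv1.weight", "bn1.weight", "layer1.0.conv1.weight"]
checkpoint_keys = [
    "backbone.conv1.weight",
    "backbone.bn1.weight",
    "backbone.stem.0.weight",
]

report = compare_keys(model_keys, checkpoint_keys, strip_prefix="backbone.")
print(report)
```

If `missing_in_checkpoint` is non-empty, those layers keep their random initialization, which could plausibly explain a drop in mIoU even though loading appeared to succeed.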

WXinlong commented 3 years ago

Can you post the results of the MoCo v2 model, which can be downloaded from OpenSelfSup, along with the DeepLabV3+ training logs for these three pretrained models? I will try to help.

syorami commented 3 years ago

I haven't run experiments with MoCo v2, but I will give it a try.

WXinlong commented 3 years ago

I'm closing this issue because the requested information was not provided.