376498485 closed this issue 1 year ago
Hi, there are three main differences: 1) We did not fine-tune the CLIP model in the latest implementation. 2) The backbone network, including the 1x1 convolutional layer, is initialized purely from the pre-trained multi-label classification model 'res50_cam.pth'. 3) The CBS loss. Please check the source code for more details.
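As a rough illustration of points 1) and 2), here is a minimal PyTorch sketch. It is not the repository's code: `clip_like` and `make_backbone` are hypothetical stand-ins for the real CLIP model and the ResNet-50 CAM backbone, the 20-class head is an assumption, and an in-memory buffer stands in for 'res50_cam.pth' so the snippet runs without the file.

```python
import io

import torch
import torch.nn as nn


def make_backbone() -> nn.Sequential:
    # Hypothetical toy backbone ending in a 1x1 convolutional layer
    # (20 output channels assumed, one per class).
    return nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.Conv2d(16, 20, kernel_size=1),
    )


# 1) Keep the CLIP-like model fixed: eval mode, no gradients.
clip_like = nn.Linear(8, 4)  # hypothetical stand-in for CLIP
clip_like.eval()
for p in clip_like.parameters():
    p.requires_grad_(False)

# 2) Initialise the whole backbone, 1x1 conv included, from a
#    pre-trained classification checkpoint.
pretrained = make_backbone()  # plays the role of the classification model
buffer = io.BytesIO()         # plays the role of 'res50_cam.pth'
torch.save(pretrained.state_dict(), buffer)
buffer.seek(0)

backbone = make_backbone()
state = torch.load(buffer, map_location="cpu")
backbone.load_state_dict(state, strict=True)

# Only the backbone's trainable parameters are handed to the optimizer.
optimizer = torch.optim.SGD(
    [p for p in backbone.parameters() if p.requires_grad], lr=0.01
)
```

With `strict=True`, loading fails loudly if any layer, such as the 1x1 convolution, is missing from the checkpoint, which is one way to confirm that nothing was left randomly initialized.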
If you would like to cite our paper, please cite the performance in the CVPR version. We will update the paper and upload code for COCO in the coming month.
BTW, welcome to star our project! :)
Actually, with the same DeepLabV2, the performance improves from ~69 to ~70 (ImageNet pre-trained) and from ~70 to 71.4 (COCO pre-trained).
| Method | Sup. | Segmentation network | Pre-training | val | test |
| --- | --- | --- | --- | --- | --- |
| CLIMS (camera-ready) | I | DeepLabV2 | ImageNet | 69.3 | 68.7 |
| CLIMS (camera-ready) | I | DeepLabV2 | COCO | 70.4 | 70.0 |
| CLIMS (this repo) | I | DeepLabV2 | ImageNet | 70.3 | 70.6 |
| CLIMS (this repo) | I | DeepLabV2 | COCO | 71.4 | 71.2 |
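To make the improvement explicit, the gains can be computed directly from the numbers above. This is only a small arithmetic sketch: the `results` dictionary and the `gains` helper are illustrative names, and the two score columns are assumed to be val and test mIoU, which the thread does not label.

```python
# Scores copied from the thread: (first score column, second score column),
# assumed here to be val and test mIoU.
results = {
    ("camera-ready", "ImageNet"): (69.3, 68.7),
    ("camera-ready", "COCO"): (70.4, 70.0),
    ("this repo", "ImageNet"): (70.3, 70.6),
    ("this repo", "COCO"): (71.4, 71.2),
}


def gains(pretraining: str) -> tuple:
    """Return the per-column improvement of 'this repo' over 'camera-ready'."""
    old = results[("camera-ready", pretraining)]
    new = results[("this repo", pretraining)]
    # Round to one decimal to hide floating-point noise.
    return tuple(round(n - o, 1) for n, o in zip(new, old))


for pretraining in ("ImageNet", "COCO"):
    first, second = gains(pretraining)
    print(f"{pretraining}: +{first:.1f} / +{second:.1f}")
```

With the same DeepLabV2, the first score column rises by 1.0 point for both pre-training settings, and the second by 1.9 (ImageNet) and 1.2 (COCO).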
Thank you very much! I will follow your next work.
Hi, thanks for your great work. Could you please tell me the difference between the previous version and the new version? What have you done to improve your mIoU from 70 to 73?