Seungho Lee1,*, Minhyun Lee1,*, Jongwuk Lee2, Hyunjung Shim1
* indicates an equal contribution
1 School of Integrated Technology, Yonsei University
2 Department of Computer Science and Engineering, Sungkyunkwan University
Existing studies in weakly-supervised semantic segmentation (WSSS) using image-level weak supervision have several limitations: sparse object coverage, inaccurate object boundaries, and co-occurring pixels from non-target objects. To overcome these challenges, we propose a novel framework, namely Explicit Pseudo-pixel Supervision (EPS), which learns from pixel-level feedback by combining two weak supervisions: the image-level label provides the object identity via the localization map, and the saliency map from an off-the-shelf saliency detection model offers rich boundaries. We devise a joint training strategy to fully utilize the complementary relationship between the two sources of information. Our method can obtain accurate object boundaries and discard co-occurring pixels, thereby significantly improving the quality of pseudo-masks.
12 Jul, 2021: Initial upload
19 Aug, 2021: Minor update on information about dCRF and the pre-trained model of the segmentation networks
28 Aug, 2021: Major updates about MS-COCO 2014 dataset and minor updates (cleanup)
15 Apr, 2022: Minor update on how to set up 'cls_labels.npy' for the MS-COCO 2017 dataset
22 Feb, 2023: Minor update on the download links for the MS-COCO dataset (masks, saliency maps)
PASCAL VOC 2012
MS-COCO 2014
Pretrained models
MS-COCO 2017
Execute the bash file for training, inference and evaluation.
# Please see these files for execution details.
# PASCAL VOC 2012
# Baseline
bash script/voc12_cls.sh
# EPS
bash script/voc12_eps.sh
# MS-COCO 2014
# Baseline
bash script/coco_cls.sh
# EPS
bash script/coco_eps.sh
We provide checkpoints, training logs, and performance numbers for each method on each dataset.
Please see the script files for details.
Dataset | Method | Train mIoU (%) | Checkpoint | Training log |
---|---|---|---|---|
PASCAL VOC 2012 | Base | 47.05 | Download | voc12_cls.log |
PASCAL VOC 2012 | EPS | 69.22 | Download | voc12_eps.log |
MS-COCO 2014 | Base | 31.23 | Download | coco_cls.log |
MS-COCO 2014 | EPS | 37.15 | Download | coco_eps.log |
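For readers unfamiliar with the metric in the table, the following is a minimal sketch of how mean IoU is commonly computed from ground-truth and predicted label maps. It is not taken from this repository's evaluation code. The function names, the choice to skip classes that never appear, and the use of 255 as the ignored label (the PASCAL VOC void label) are assumptions made for illustration.

```python
from typing import List, Sequence


def confusion_matrix(gt: Sequence[int], pred: Sequence[int],
                     num_classes: int, ignore_index: int = 255) -> List[List[int]]:
    """Count (ground truth, prediction) pairs over flattened label maps."""
    conf = [[0] * num_classes for _ in range(num_classes)]
    for g, p in zip(gt, pred):
        if g == ignore_index:  # assumption: void pixels are excluded
            continue
        conf[g][p] += 1
    return conf


def mean_iou(conf: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    n = len(conf)
    ious = []
    for c in range(n):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp                       # row total minus diagonal
        fp = sum(conf[r][c] for r in range(n)) - tp  # column total minus diagonal
        denom = tp + fp + fn
        if denom > 0:  # assumption: classes absent from both maps are skipped
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with two classes and one ignored pixel.
gt = [0, 0, 1, 1, 255]
pred = [0, 1, 1, 1, 0]
score = mean_iou(confusion_matrix(gt, pred, num_classes=2))
```

In practice the confusion matrix would be accumulated over every image in the split before the per-class IoU is taken, and the result multiplied by 100 to match the percentages in the table.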
dCRF hyper-parameters
CRF parameters: bi_w = 4, bi_xy_std = 67, bi_rgb_std = 3, pos_w = 3, pos_xy_std = 1.
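As a rough illustration of where these values go, the sketch below maps them onto the pairwise terms of the `pydensecrf` library. It assumes `pydensecrf` and NumPy are installed, and it is not this repository's own post-processing code. The `DenseCRFParams` class, the `refine` function, and the `iter_max = 10` iteration count are hypothetical and are not stated in this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DenseCRFParams:
    # Values listed in this README.
    bi_w: float = 4        # weight of the bilateral (appearance) kernel
    bi_xy_std: float = 67  # spatial std of the bilateral kernel
    bi_rgb_std: float = 3  # color std of the bilateral kernel
    pos_w: float = 3       # weight of the Gaussian (smoothness) kernel
    pos_xy_std: float = 1  # spatial std of the Gaussian kernel
    iter_max: int = 10     # assumption: iteration count is not given in the README

    def gaussian_kwargs(self) -> dict:
        """Keyword arguments for DenseCRF2D.addPairwiseGaussian."""
        return {"sxy": self.pos_xy_std, "compat": self.pos_w}

    def bilateral_kwargs(self) -> dict:
        """Keyword arguments for DenseCRF2D.addPairwiseBilateral (without rgbim)."""
        return {"sxy": self.bi_xy_std, "srgb": self.bi_rgb_std, "compat": self.bi_w}


def refine(image, probs, params: DenseCRFParams = DenseCRFParams()):
    """Refine class probabilities with a dense CRF.

    image: H x W x 3 uint8 array; probs: C x H x W float32 softmax output.
    Libraries are imported here so the parameter mapping can be inspected
    without pydensecrf installed.
    """
    import numpy as np
    import pydensecrf.densecrf as dcrf
    from pydensecrf.utils import unary_from_softmax

    c, h, w = probs.shape
    crf = dcrf.DenseCRF2D(w, h, c)
    crf.setUnaryEnergy(np.ascontiguousarray(unary_from_softmax(probs)))
    crf.addPairwiseGaussian(**params.gaussian_kwargs())
    crf.addPairwiseBilateral(rgbim=np.ascontiguousarray(image),
                             **params.bilateral_kwargs())
    q = crf.inference(params.iter_max)
    return np.array(q).reshape(c, h, w)
```

Calling `refine(image, probs).argmax(axis=0)` would then give a refined label map under these assumptions.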
We utilize DeepLab-V2 for the segmentation network.
Please see deeplab-pytorch for the implementation in PyTorch.
For the VGG16-based network we used the pretrained model from the official DeepLab release, and for the ResNet101-based network the one from the official OAA release.
This code borrows heavily from PSA. Thanks to Jiwoon Ahn.