Closed LiheYoung closed 2 years ago
No, we don't apply any tricks for the supervised training, except the deep-stem blocks and SyncBN, following the CPS approach. We also keep the number of training iterations on the labelled data the same between the supervised and semi-supervised settings.
Since we use the teacher network for inference in all settings, one plausible explanation is that the self-ensembled teacher boosts the baseline even in the supervised setting.
The code for VOC12 will be released very soon, and we appreciate your interest.
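For context, a self-ensembled teacher of this kind is typically an exponential moving average (EMA) of the student's weights, updated after every optimizer step. A minimal PyTorch sketch (the `decay` value and the toy `nn.Linear` module are illustrative, not taken from the paper's code):

```python
import copy
import torch
import torch.nn as nn

@torch.no_grad()
def ema_update(teacher: nn.Module, student: nn.Module, decay: float = 0.99):
    """In-place EMA update: teacher = decay * teacher + (1 - decay) * student."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(decay).add_(s_param, alpha=1 - decay)

# Usage: the teacher starts as a copy of the student and is never
# updated by gradients, only by EMA after each student optimizer step.
student = nn.Linear(4, 2)
teacher = copy.deepcopy(student)
ema_update(teacher, student, decay=0.99)
```

Because the teacher averages the student over many recent iterations, it often evaluates slightly better than the instantaneous student, which would be consistent with the baseline gap discussed here.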
OK, thanks. I notice that according to Figure 3, your supervised baseline results on the Pascal 1/8 setting are ~71 (RN-50) and ~74 (RN-101), while the corresponding results from CPS are 69.43 (RN-50) and 72.21 (RN-101). This indicates the EMA teacher may boost the student by ~1.5%.
I wonder whether the EMA teacher is still superior to the online student by ~1.5% in the final evaluation of the semi-supervised setting.
Also, somewhat strangely, your Cityscapes supervised baseline results do not show an improvement over CPS.
No, the teacher and student eventually converge to similar local minima in the semi-supervised experiments. I believe the mIoU gap should be less than 1.5% for all the partition protocols. Comparing our data loader with that of CPS (https://github.com/charlesCXK/TorchSemiSeg/blob/f67b37362ad019570fe48c5884187ea85f2cc045/furnace/datasets/BaseDataset.py), I found that we additionally apply input perturbations (e.g., colour jittering, Gaussian blur, and so on). Perhaps these extra input augmentations also bring some improvement for the supervised training.
Sorry for the confusion about the Cityscapes setting. The Cityscapes supervised baseline in the figure is trained with the plain CE loss at a smaller resolution (712×712), as described in Section 4.1. These settings yield lower results than CPS (which uses 800×800 and OHEM), but they keep the comparison fair with other previous works (e.g., CAC, ECS). We also compare against CPS under its own setting (as shown in Tab. 2) to demonstrate the effectiveness.
Okay, I got it. Thanks.
Hi,
Did you apply any extra techniques to your supervised baseline, such as setting the output stride to 8, an auxiliary loss, or OHEM? Your reported baseline results on the Pascal dataset are very high, according to Figure 3.
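For readers unfamiliar with OHEM (online hard example mining), it restricts the cross-entropy loss to the hardest pixels in each batch. A minimal sketch, where the threshold and `min_kept` values are illustrative rather than the settings used by CPS or this paper:

```python
import torch
import torch.nn.functional as F

def ohem_ce_loss(logits, target, thresh=0.7, min_kept=1000, ignore_index=255):
    """Cross-entropy averaged only over 'hard' pixels: those whose predicted
    probability for the true class falls below `thresh`, keeping at least
    `min_kept` pixels so the loss never collapses to a tiny set."""
    # Per-pixel loss, flattened to shape (N * H * W,).
    pixel_losses = F.cross_entropy(
        logits, target, ignore_index=ignore_index, reduction="none").flatten()
    # Probability threshold expressed as a loss value: -log(thresh).
    loss_thresh = -torch.log(torch.tensor(thresh))
    sorted_losses, _ = torch.sort(pixel_losses, descending=True)
    if sorted_losses.numel() > min_kept and sorted_losses[min_kept] > loss_thresh:
        # More than min_kept pixels are hard; raise the bar to the
        # min_kept-th largest loss.
        loss_thresh = sorted_losses[min_kept]
    kept = pixel_losses[pixel_losses >= loss_thresh]
    return kept.mean() if kept.numel() > 0 else pixel_losses.mean()
```

Focusing the loss on hard pixels typically lifts a supervised segmentation baseline by a noticeable margin, which is why the questioner asks whether it was used here.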