won-bae opened this issue 3 years ago:

Hi, thanks for sharing this great work!
In the previous issue, you said you used CCT's code for semi-supervised learning with the replacement of pseudo labels. Does that mean you used the default configuration provided in CCT?
Hi @won-bae,
Yes, we did use the default setting provided in the official CCT repository.
Thank you for the reply!
Unfortunately, when I trained the CCT model using the default config (`use_weak_lables: true`) with the replacement of `weak_labels_output` from `pseudo_labels` (the IRN results provided in the CCT repo) to `sem_seg` (AdvCAM + IRN, if my understanding is correct), I only got 74.8, which seems significantly lower than 77.8. To generate `sem_seg`, I simply followed steps 2 and 3 using the pre-trained classifier and got evaluation results of 55.5 and 69.3 on the training set for `cam_adv_mask` and `sem_seg`, respectively, which seems about right based on Table 1. Do you have any idea about what potentially went wrong with my trial? Any suggestions would be appreciated.
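(For concreteness, the swap described above amounts to something like the following. This is only a sketch: the key names `use_weak_lables` and `weak_labels_output` come from this thread, but the config path, the flat nesting, and the `sem_seg` directory location are assumptions from my local setup.)

```python
import json

# Load CCT's default config (path is an assumption; CCT ships a JSON config).
with open("configs/config.json") as f:
    config = json.load(f)

# Keep the default semi-supervised setting that consumes weak labels.
# Note: the spelling "use_weak_lables" matches the key used in this thread.
config["use_weak_lables"] = True

# Point the weak labels at the AdvCAM+IRN masks ("sem_seg", produced by
# AdvCAM steps 2-3) instead of the IRN results ("pseudo_labels") shipped
# with the CCT repo. Both directory names are from this thread.
config["weak_labels_output"] = "sem_seg"

with open("configs/config_advcam.json", "w") as f:
    json.dump(config, f, indent=4)
```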
Sorry for the late reply.
We used the multi-scale inference provided by the official CCT repository, and we applied CRFs. You can refer to the deeplab-pytorch repository (https://github.com/kazuto1011/deeplab-pytorch) for more details.
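(For anyone reproducing this, here is a minimal sketch of multi-scale testing followed by dense-CRF refinement in the style of deeplab-pytorch. It assumes a PyTorch model returning per-pixel logits and the pydensecrf package; the scale set and CRF weights below are illustrative defaults, not necessarily the exact values used.)

```python
import numpy as np
import torch
import torch.nn.functional as F
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

def multiscale_probs(model, image, scales=(0.5, 0.75, 1.0, 1.25, 1.5)):
    """Average softmax probabilities over rescaled copies of the image.

    `image` is a 1xCxHxW float tensor; `model` returns per-pixel logits.
    """
    _, _, h, w = image.shape
    probs = 0
    for s in scales:
        scaled = F.interpolate(image, scale_factor=s, mode="bilinear",
                               align_corners=False)
        with torch.no_grad():
            logits = model(scaled)
        # Resize logits back to the original resolution before averaging.
        logits = F.interpolate(logits, size=(h, w), mode="bilinear",
                               align_corners=False)
        probs = probs + F.softmax(logits, dim=1)
    return (probs / len(scales))[0].cpu().numpy()  # (n_classes, H, W)

def crf_refine(image_np, probs, n_iters=10):
    """Refine class probabilities with a dense CRF (pydensecrf).

    `image_np` is an HxWx3 uint8 array; `probs` is (n_classes, H, W).
    The pairwise weights below are illustrative, not tuned values.
    """
    n_classes, h, w = probs.shape
    d = dcrf.DenseCRF2D(w, h, n_classes)
    d.setUnaryEnergy(unary_from_softmax(probs))
    d.addPairwiseGaussian(sxy=3, compat=3)
    d.addPairwiseBilateral(sxy=67, srgb=3,
                           rgbim=np.ascontiguousarray(image_np), compat=4)
    q = d.inference(n_iters)
    return np.array(q).reshape(n_classes, h, w).argmax(axis=0)  # HxW labels
```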
Thank you.
No worries! Thank you for your reply. So does that mean you applied multi-scale inference on the validation set? I am asking because, as far as I know, CCT applies it only to test images (inference.py in CCT).
Hi @won-bae, sorry for the late reply.
Yes, I applied multi-scale testing on the validation images.
Thank you.
In the CCT paper, they reported 73.2, which is based on `epochs: 50`. In their official repo, the default config sets `epochs` to 80. The difference in epochs makes a significant change in performance.
Similarly, CRF and multi-scale inference at test time indeed give a significant improvement. Here are some experimental results on AdvCAM with different tweaks.
Epochs | Multi-Scale | CRF | AdvCAM (mIoU) |
---|---|---|---|
80 | True | True | 77.8 (reported) |
80 | True | True | 77.6 |
80 | True | False | 77.1 |
80 | False | True | 76.1 |
80 | False | False | 74.8 |
50 | False | True | 75.8 |
50 | False | False | 74.5 |
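(In case it helps others cross-check the table: the numbers are mean IoU under the standard PASCAL VOC protocol. Below is a minimal sketch of that metric, assuming 21 classes and HxW integer label maps with 255 as the ignore label.)

```python
import numpy as np

def mean_iou(preds, gts, n_classes=21, ignore_index=255):
    """Mean IoU over a dataset via an accumulated confusion matrix.

    `preds` and `gts` are iterables of HxW integer label maps; ground-truth
    pixels labeled `ignore_index` are excluded from the evaluation.
    """
    conf = np.zeros((n_classes, n_classes), dtype=np.int64)
    for pred, gt in zip(preds, gts):
        mask = gt != ignore_index
        # Rows index ground truth, columns index prediction.
        conf += np.bincount(
            gt[mask].astype(np.int64) * n_classes + pred[mask].astype(np.int64),
            minlength=n_classes ** 2,
        ).reshape(n_classes, n_classes)
    inter = np.diag(conf).astype(np.float64)
    union = conf.sum(axis=0) + conf.sum(axis=1) - np.diag(conf)
    valid = union > 0  # skip classes that never appear
    return (inter[valid] / union[valid]).mean()
```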
Given that what you reported for the CCT baseline (73.2) is comparable to the last row in the table above (74.5), I am not sure if you can claim "the performance of our method on the validation images was 4.6% (73.2 -> 77.8) better than that of CCT". Or am I missing something here?
Could you clarify this? @jbeomlee93