microsoft / ProDA

Prototypical Pseudo Label Denoising and Target Structure Learning for Domain Adaptive Semantic Segmentation (CVPR 2021)
https://arxiv.org/abs/2101.10979
MIT License

Question about structure learning #4

Closed super233 closed 3 years ago

super233 commented 3 years ago

I noticed that both weak and strong augmentation are used in structure learning.

In my opinion, the difference between the strongly augmented image and the original image is greater than the difference between the weakly augmented image and the original. Why do you use weak augmentation rather than the original image? Did you do an ablation study on different levels of weak augmentation, or on using no augmentation at all?

theo2021 commented 3 years ago

Hi, the author can answer this better, but the technique is commonly used in contrastive learning [1, 2]. In contrastive learning, you train on the idea that two different inputs produced from the same sample should lead to the same output. It is a self-supervised technique that aims to create representations that are close to each other for augmented views of the same sample.
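To make that concrete, here is a minimal PyTorch sketch of the weak-to-strong consistency idea, not the authors' actual loss. `model`, `weak_aug`, and `strong_aug` are placeholders; the strong perturbation is applied on top of the weak view so the two predictions stay pixel-aligned, which matters for segmentation.

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, images, weak_aug, strong_aug):
    """Encourage the prediction on a strong view to match the
    (detached) prediction on a weak view of the same images."""
    weak_views = weak_aug(images)
    with torch.no_grad():
        # The weak view provides the target; no gradients flow here.
        target = model(weak_views).softmax(dim=1)
    # Appearance-only strong perturbation of the same weak view keeps
    # the pixels aligned, so a per-pixel loss is meaningful.
    logits = model(strong_aug(weak_views))
    # KL divergence between the two per-pixel class distributions.
    return F.kl_div(logits.log_softmax(dim=1), target, reduction='batchmean')
```

Detaching the weak branch turns it into a fixed target, the usual mean-teacher / FixMatch-style trick, so the network cannot collapse both branches to a trivial solution.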

Indeed, the weak and strong views are sometimes closer together than the strong view and the original. But in most cases they are further apart: since the transformations are random, it is rare for the two views to be similar. Moreover, applying a weak transform adds further randomness, which can benefit learning in the long run. In the authors' case the network learns a feature representation that is shared across augmentations, whereas in the setup you describe the network would instead learn to undo the transformations, so that its output is as close as possible to the original image.
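For illustration, a typical weak/strong split might look like the following with torchvision. The specific transforms and parameters here are assumptions for the sketch, not the ones used in ProDA.

```python
import torchvision.transforms as T

# Weak: mild geometric perturbations (assumed examples).
weak_aug = T.Compose([
    T.RandomHorizontalFlip(p=0.5),
    T.RandomCrop(512, padding=16),
])

# Strong: appearance-only distortions applied on top of the weak view,
# so predictions on the two views remain pixel-aligned.
strong_aug = T.Compose([
    T.ColorJitter(brightness=0.5, contrast=0.5, saturation=0.5, hue=0.1),
    T.RandomGrayscale(p=0.2),
    T.GaussianBlur(kernel_size=23),
])
```

Keeping the strong branch photometric-only is a common design choice in consistency training for dense prediction: heavy color noise changes appearance a lot without breaking the spatial correspondence the loss relies on.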

super233 commented 3 years ago

Thanks for your answer, I'm a newcomer to UDA. 🏃‍♂️