lhoyer / MIC

[CVPR23] Official Implementation of MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation

Teacher model for inference #79

Open kimkj38 opened 4 months ago

kimkj38 commented 4 months ago

Hi. I have a question about inference. As I understand it, the teacher model is more robust than the student model because of the EMA update, and it is used for supervision (pseudo-labels). So why do you use the student model for inference? Would there be any problem with using the teacher model for inference instead?

lhoyer commented 3 months ago

Hi @kimkj38,

I did not evaluate the teacher's performance for MIC. However, in previous UDA studies, we found that the student and teacher have similar performance at the end of training. The EMA teacher is particularly important at the beginning of training, when the network learns quickly and its predictions change rapidly; there, it ensures temporally stable pseudo-labels for stable self-training. Once training converges and the learning rate has decayed, there is less instability, and the student and teacher converge to similar predictions.

Best, Lukas
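
To illustrate the point above, here is a minimal toy sketch in plain Python. It is not the MIC training code. The one-parameter "student", the quadratic objective, the linear learning-rate decay, and the values `alpha=0.99` and `base_lr=0.1` are all assumptions chosen only to make the effect visible. It shows that an EMA teacher lags far behind a fast-moving student early on, and that the gap between them shrinks toward zero once the learning rate has decayed.

```python
def ema_update(teacher, student, alpha):
    """EMA step: teacher <- alpha * teacher + (1 - alpha) * student."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


def simulate(steps=2000, alpha=0.99, base_lr=0.1, target=1.0):
    """Toy training loop (hypothetical setup, not the MIC pipeline).

    The student does gradient descent on 0.5 * (w - target)^2 with a
    learning rate that decays linearly to zero; the teacher tracks the
    student via EMA. Returns the teacher-student gap after every step.
    """
    student = [0.0]
    teacher = [0.0]
    gaps = []
    for k in range(steps):
        lr = base_lr * (1.0 - k / steps)  # linearly decayed learning rate
        student = [w - lr * (w - target) for w in student]  # gradient step
        teacher = ema_update(teacher, student, alpha)       # EMA step
        gaps.append(abs(teacher[0] - student[0]))
    return gaps


gaps = simulate()
early_gap = gaps[19]   # gap after 20 steps, while the student moves fast
final_gap = gaps[-1]   # gap at the end of training
print(f"early gap: {early_gap:.4f}, final gap: {final_gap:.2e}")
```

In this toy setting the two sets of weights are practically identical at the end, which is consistent with the observation that evaluating either the student or the teacher gives similar results after convergence.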