In the first stage, you train the teacher model with weak augmentation. However, in your experiments, the model trained with strong augmentation outperforms the model trained with weak augmentation. Why not use the model trained with strong augmentation as the teacher model?