sanyouwu opened this issue 5 years ago
@xinmei9322
@SanyouWu Sorry for the late reply. I just saw the message. I set the ramp-up length to 5, the same as in the original Mean-Teacher code. The implementation is based on their public code, and the hyper-parameters remain the same except for the new hyper-parameter we introduced.
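For context, the public Mean-Teacher code ramps the consistency weight up with a sigmoid-shaped schedule of the form exp(-5 * (1 - t)^2) over the ramp-up length. A minimal sketch of that schedule (function and variable names here are illustrative, not the exact ones in the repo):

```python
import numpy as np

def sigmoid_rampup(current_epoch, rampup_length):
    """Sigmoid ramp-up from ~0 to 1 over `rampup_length` epochs,
    following the exp(-5 * (1 - t)^2) schedule used in the Mean-Teacher code."""
    if rampup_length == 0:
        return 1.0
    phase = 1.0 - np.clip(current_epoch, 0.0, rampup_length) / rampup_length
    return float(np.exp(-5.0 * phase * phase))

# With rampup_length = 5, the weight is ~0.007 at epoch 0 and reaches 1.0 at epoch 5.
print([round(sigmoid_rampup(e, 5), 3) for e in range(7)])
```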
Alright, thanks. But I think it would be more reasonable to ramp up w(t) over more than 5 epochs, since the classification accuracy is low at the start of training. What I mean is: why not use two ramp-up functions, w1(t) and w2(t), where w1(t) (5 epochs) is for the consistency loss and w2(t) (longer than 5 epochs) is for the contrastive loss? A sketch of this idea follows below. @xinmei9322
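A minimal sketch of that two-schedule suggestion, assuming the sigmoid ramp-up from the Mean-Teacher code; the names w1/w2, the maximum weights, and the ramp-up lengths are placeholders for illustration, not the values used in the SNTG code:

```python
import numpy as np

def sigmoid_rampup(current_epoch, rampup_length):
    # Same sigmoid ramp-up shape as in the Mean-Teacher code: exp(-5 * (1 - t)^2).
    if rampup_length == 0:
        return 1.0
    phase = 1.0 - np.clip(current_epoch, 0.0, rampup_length) / rampup_length
    return float(np.exp(-5.0 * phase * phase))

def w1(epoch, max_weight=100.0, rampup_length=5):
    # Weight for the consistency loss: short ramp-up (5 epochs), as in Mean Teacher.
    return max_weight * sigmoid_rampup(epoch, rampup_length)

def w2(epoch, max_weight=1.0, rampup_length=80):
    # Weight for the SNTG contrastive loss: longer ramp-up, so it only
    # contributes strongly once the classifier's predictions are more reliable.
    return max_weight * sigmoid_rampup(epoch, rampup_length)

# Hypothetical per-epoch weighting of the total objective under this idea:
#   loss = classification_loss + w1(epoch) * consistency_loss + w2(epoch) * sntg_loss
```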
Hi, thanks for your contributions! I am curious about the ramp-up function w(t) when you use the Mean-Teacher model + your SNTG. Did you set the ramp-up length to 80 epochs (as in Appendix A of your paper) or to 5 (as in the Mean-Teacher code)?