CuriousAI / mean-teacher

A state-of-the-art semi-supervised method for image recognition
https://arxiv.org/abs/1703.01780
Other
1.56k stars 331 forks source link

Applying approach for NLP problem #56

Open tarunbhatiaind opened 2 years ago

tarunbhatiaind commented 2 years ago

I am planning to apply mean-teacher for my problem of token classification. Since adding different noise for teacher and student is really important for the approach, i am confused about how to calculate consistency cost as length of active logits would differ. for e.g. if i use synonym noise then it can happen that it increases the length of the sentence (some tokens maybe replaces by synonym of len 2) when given to teacher model and same augmentation may generate different sentence(of different length) when given to student model.