dvlab-research / ReviewKD

Distilling Knowledge via Knowledge Review, CVPR 2021

about teacher net #12

Closed yyuxin closed 2 years ago

yyuxin commented 2 years ago

Thank you very much for your work!

I noticed that before distillation the teacher network is loaded from a pre-trained model. Is the teacher network kept fixed during distillation? I couldn't find where this is done in the code (e.g., a detach() call or p.requires_grad = False).

akuxcw commented 2 years ago

Hi, please refer to https://github.com/dvlab-research/ReviewKD/blob/master/CIFAR-100/util/misc.py#L84
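For context, a common PyTorch pattern for keeping a teacher fixed during distillation is sketched below. This is a minimal, general-purpose sketch, not the exact code at the linked line; the `kd_loss` helper in the comments is hypothetical.

```python
import torch
import torch.nn as nn

def freeze(model: nn.Module) -> nn.Module:
    """Freeze a pre-trained teacher: no gradients, fixed BN/dropout behavior."""
    for p in model.parameters():
        p.requires_grad = False  # keep teacher weights out of the optimizer
    model.eval()  # BatchNorm/Dropout in inference mode
    return model

# In the distillation loop, run the teacher under no_grad so no autograd
# graph is built for it; only the student receives gradients.
#
# teacher = freeze(teacher)
# with torch.no_grad():
#     t_feats = teacher(images)
# s_feats = student(images)
# loss = kd_loss(s_feats, t_feats)  # hypothetical distillation loss helper
```

Running the teacher forward pass under `torch.no_grad()` already stops gradients from flowing into it, so an explicit `detach()` on its outputs is then unnecessary.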