Closed IfuChan closed 2 years ago
Please refer to this config: https://github.com/megvii-research/mdistiller/blob/master/configs/cifar100/vanilla.yaml (change the "student" field to "resnet32x4").
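For illustration, a sketch of what the edited config might look like. The field names follow the repo's config style but are assumptions; check them against the actual `vanilla.yaml`:

```yaml
# Sketch of configs/cifar100/vanilla.yaml with the student swapped
# to resnet32x4 -- verify field names against the real file.
DISTILLER:
  TYPE: "NONE"          # vanilla training, no distillation
  STUDENT: "resnet32x4"
```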
Thanks for replying. I tried that, but after training, when I use the DKD setting to train the student, `load_state_dict` in train.py raises missing-keys and unexpected-keys errors. The model I trained has keys like `module.student.conv1.weight`, while the missing ones are `conv1.weight`, etc.
Two types of model files are saved (, student). The ones with 'student' in their names are the ones suitable as pretrained models.
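If you do want to load the full checkpoint into a bare student model instead, one way is to strip the wrapper prefix from the keys first. A minimal sketch (the helper name and the exact prefix are my assumptions, based on the key names reported above):

```python
# Strip the "module.student." prefix that DataParallel-wrapped training
# adds, so keys match a bare student model's state_dict (e.g.
# "module.student.conv1.weight" -> "conv1.weight").
def strip_prefix(state_dict, prefix="module.student."):
    return {k[len(prefix):] if k.startswith(prefix) else k: v
            for k, v in state_dict.items()}

saved = {"module.student.conv1.weight": 0, "module.student.fc.bias": 1}
cleaned = strip_prefix(saved)
# cleaned keys: "conv1.weight", "fc.bias"
```

The cleaned dict can then be passed to `model.load_state_dict(...)` on the unwrapped student.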
Worked, thank you. I also wanted to ask why the models with 'student' in their names are smaller than the ones without it. Since we are using vanilla training here, no distillation is happening, so I do not understand why they would be smaller.
We save all the parameters (of the student, the teacher, and the connector/contrastive memory, etc.) in the checkpoints. Vanilla models are therefore also saved with the teacher's (unused) parameters, and we will consider fixing that.
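That size difference can be sketched as a key filter: dropping the teacher's entries before saving shrinks the file. The `"teacher"` prefix matching below is an assumption about how the saved state_dict names its entries:

```python
# Sketch: keep only the student's parameters from a full checkpoint
# state_dict, so the unused teacher weights don't inflate file size.
def student_only(state_dict):
    return {k: v for k, v in state_dict.items()
            if not k.startswith(("module.teacher.", "teacher."))}

full = {"module.student.fc.weight": 0, "module.teacher.fc.weight": 1}
trimmed = student_only(full)
# trimmed keeps only the student entry
```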
Understood, thank you very much.
Hi, is there a way to train a single model like resnet32x4 on CIFAR-100 with this repo? I want to train the models from scratch without using the pretrained models.