haitongli / knowledge-distillation-pytorch

A PyTorch implementation for exploring deep and shallow knowledge distillation (KD) experiments with flexibility
MIT License

In the mnist folder, why do teacher_mnist and student_mnist not contain the softmax? #27

Open parquets opened 4 years ago

briliantnugraha commented 4 years ago

Hi @parquets, FYI: the softmax is applied when the loss function is computed. See the KD loss here: https://github.com/peterliht/knowledge-distillation-pytorch/blob/master/model/net.py#L110-L112
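For reference, here is a minimal sketch of that pattern: both networks output raw logits, and the (log-)softmax with a temperature is applied inside the loss itself. The function name `kd_loss` and the default values of `alpha` and `temperature` below are illustrative, not necessarily the repo's exact ones.

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, alpha=0.9, temperature=4.0):
    """Knowledge-distillation loss sketch: a soft KL term on
    temperature-scaled logits plus a hard cross-entropy term.
    The softmax/log-softmax happens here, so the models themselves
    can return raw logits without a softmax layer."""
    # Soft targets: log-softmax over student logits vs. softmax over
    # teacher logits, both divided by the temperature T.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (alpha * temperature * temperature)
    # Hard targets: ordinary cross-entropy against the ground-truth labels
    # (F.cross_entropy applies log-softmax internally as well).
    hard = F.cross_entropy(student_logits, labels) * (1.0 - alpha)
    return soft + hard
```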

parquets commented 4 years ago

Thank you, I get it.

parquets commented 4 years ago

@peterliht Could you share the pre-trained model for ResNet110?