zhouzaida / channel-distillation

PyTorch implementation for Channel Distillation
100 stars 17 forks

question about function adjust_loss_alpha #5

Closed UcanSee closed 4 years ago

UcanSee commented 4 years ago

I found that the weights of the ce loss and kd loss are zero in the first 30 epochs when training on ImageNet in your code (https://github.com/zhouzaida/channel-distillation/blob/2464d30ba01e11491e520e51be498c91f1e54b91/utils/util.py#L9). I want to know why the original ce loss is set to zero at the start of training?

zgcr commented 4 years ago

> I found that the weights of the ce loss and kd loss are zero in the first 30 epochs when training on ImageNet in your code (https://github.com/zhouzaida/channel-distillation/blob/2464d30ba01e11491e520e51be498c91f1e54b91/utils/util.py#L9). I want to know why the original ce loss is set to zero at the start of training?

The ce loss weight is set to 0 for the first 30 epochs because we want the teacher model to initialize the student model's weights by training with the cd loss alone.
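The warm-up idea described above can be sketched as a simple loss-weight schedule. This is a hypothetical illustration, not the repository's actual `adjust_loss_alpha` implementation; the function name `loss_weights` and the parameters `warmup_epochs`, `base_ce`, and `base_cd` are assumptions for the sketch:

```python
def loss_weights(epoch, warmup_epochs=30, base_ce=1.0, base_cd=1.0):
    """Hypothetical schedule: for the first `warmup_epochs` epochs only the
    channel-distillation (cd) loss is active, so the teacher's feature maps
    shape the student's weights before the task losses are switched on."""
    if epoch < warmup_epochs:
        # Warm-up phase: ce and kd weights are zero, only cd loss trains.
        return {"ce": 0.0, "kd": 0.0, "cd": base_cd}
    # After warm-up, all losses contribute.
    return {"ce": base_ce, "kd": base_ce, "cd": base_cd}


# Usage: combine the weighted losses in the training loop.
# total_loss = w["ce"] * ce_loss + w["kd"] * kd_loss + w["cd"] * cd_loss
```

With a schedule like this, early training behaves as pure feature initialization from the teacher, and the classification objective takes over only once the warm-up window ends.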