Open betogulliver opened 4 years ago
Q1. For ESPNet, I changed the max-pooling layer to self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1, ceil_mode=False) and retrained the teacher net.
Q2. The input to D is the RGB image at 1/8 scale.
Q3. a. The pixel-wise loss is computed on the logits.
b. For the student, I use the 1/8-scale feature, with shape [2, 256, 64, 64]. For the teacher, I use the feature after the PSP module.
To be honest, for the distillation of ESPNet as described in the paper, I used the original training code of the ESP project and added the distillation module to that project.
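As a side note on Q1, the effect of ceil_mode on the pooled size can be checked by hand. The sketch below is plain Python, not code from this repository: pool_out_size is a hypothetical helper that follows the output-size rule documented for nn.MaxPool2d (dilation 1), and it assumes the earlier setting was ceil_mode=True, which the thread does not state.

```python
import math


def pool_out_size(size, kernel_size=3, stride=2, padding=1, ceil_mode=False):
    """Output length of a max-pool along one dimension (dilation 1).

    Hypothetical helper for illustration; mirrors the formula in the
    PyTorch nn.MaxPool2d documentation.
    """
    span = size + 2 * padding - kernel_size
    if ceil_mode:
        out = math.ceil(span / stride) + 1
        # With ceil_mode, a last window that would start entirely in the
        # right padding is dropped.
        if (out - 1) * stride >= size + padding:
            out -= 1
    else:
        out = span // stride + 1
    return out


# Compare both settings for an even input size.
for mode in (False, True):
    print("ceil_mode =", mode, "->", pool_out_size(256, ceil_mode=mode))
```

With these parameters an input of 256 pools to 128 when ceil_mode=False and to 129 when ceil_mode=True, so an off-by-one of this kind would carry through to every later feature map and could explain a spatial mismatch between student and teacher features.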
Thanks for your great work.
I'm trying to train ESPNet using the 'master' branch, without success.
I tried to "recycle" some of the models/code from the 'cvpr2019' branch (networks/ESPNet.py) and use this as 'student'.
However, since the ESPNet model has a different architecture from the 'teacher' (ResNet101), I got a mismatch while trying to compare the features in the 'student_backward()' function:
student : ESPNet : 3 features
teacher : ResNet101 : 7 features
This is to be expected, but even my feature shapes don't seem to match at all (for details, see the end of this message).
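One way to narrow down a mismatch like this is to list which student/teacher feature pairs agree in spatial size before comparing them. The following is only a rough sketch in plain Python: matching_pairs is a hypothetical helper, and every shape in the example lists is made up for illustration, except [2, 256, 64, 64], which is the student shape quoted in this thread.

```python
def matching_pairs(student_shapes, teacher_shapes):
    """Return (student_idx, teacher_idx) pairs whose (H, W) sizes agree.

    Shapes are (N, C, H, W) tuples; channel counts are ignored because
    they can differ between architectures.
    """
    pairs = []
    for i, s in enumerate(student_shapes):
        for j, t in enumerate(teacher_shapes):
            if tuple(s[2:]) == tuple(t[2:]):
                pairs.append((i, j))
    return pairs


# Hypothetical shapes for illustration only (not taken from a real run).
student = [(2, 16, 256, 256), (2, 64, 128, 128), (2, 256, 64, 64)]
teacher = [(2, 256, 129, 129), (2, 512, 65, 65), (2, 512, 64, 64)]

print(matching_pairs(student, teacher))
```

In a real run the tuples would come from something like tuple(f.shape) for each returned feature tensor. Pairs that never show up in the result are the ones that need a different layer choice or resizing before any feature-level comparison.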