ZhengRui opened this issue 6 years ago
Hi, have you solved this problem yet? I need the pretrained model of the efficient version too — can you help me? Thanks @ZhengRui
In the end I decided to use the PyTorch pre-trained model: https://github.com/gpleiss/efficient_densenet_pytorch/issues/13
I tried to convert the pretrained densenet121 model provided in https://github.com/shicai/DenseNet-Caffe
to the efficient version, following your DenseBlock naming convention. I have the following prototxt (efficient_densenet121.prototxt) and a script that copies parameters (from the standard DenseNet_121.prototxt and its corresponding model) up to the end of the first transition layer:
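The prototxt and copy script referenced above were not preserved in this archive. For readers, a minimal stand-in sketch of the copying idea: with pycaffe one would normally assign `net_dst.params[dst_name][i].data[...] = net_src.params[src_name][i].data`; here plain lists stand in for the param blobs, and all layer names in the mapping are hypothetical, not the actual names from either prototxt.

```python
# Hypothetical sketch of copying params between two Caffe nets.
# Plain dicts stand in for net.params; every name below is made up.
src = {
    # Caffe BatchNorm stores three blobs: mean*s, var*s, and the scale factor s
    "conv2_1/x1/bn": [[0.1, 0.2], [1.0, 1.1], [0.999]],
    "conv2_1/x1": [[0.5, -0.5]],  # toy conv weights
}

# standard-model layer name -> efficient-version layer name (hypothetical)
name_map = {
    "conv2_1/x1/bn": "DenseBlock1_bn_1",
    "conv2_1/x1": "DenseBlock1_conv_1",
}

dst = {}
for std_name, eff_name in name_map.items():
    # copy each blob of the layer (deep copy so dst is independent of src)
    dst[eff_name] = [list(blob) for blob in src[std_name]]

assert dst["DenseBlock1_bn_1"][0] == [0.1, 0.2]
assert dst["DenseBlock1_conv_1"][0] == [0.5, -0.5]
```

The real script would also have to account for how the efficient DenseBlock layer lays out its internal blobs, which is exactly the mapping the question below is about.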
Script to test the copied parameters:
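The test script itself is also missing from the archive. The usual check is to forward the same input through both nets and compare per-blob statistics (e.g. the mean and std of `net.blobs['concat_2_6'].data`). A caffe-free sketch of just the comparison logic, with toy activations standing in for the real blob data:

```python
import statistics

def blob_stats(values):
    """Return (mean, population std) of a flat list of activations."""
    return statistics.fmean(values), statistics.pstdev(values)

def close(a, b, tol=1e-5):
    """True if a and b agree within an absolute tolerance."""
    return abs(a - b) <= tol

# Toy data standing in for net.blobs['concat_2_6'].data.flatten()
standard_out = [0.12, -0.40, 0.33, 0.05]
efficient_out = [0.12, -0.40, 0.33, 0.05]

m1, s1 = blob_stats(standard_out)
m2, s2 = blob_stats(efficient_out)
print("mean match:", close(m1, m2), "std match:", close(s1, s2))
```

In the real check the two lists would come from forwarding the same input image through the standard net and the efficient net.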
However, when I run the above script, the mean and std of the 'concat_2_6' layer do not match. The difference is not large, but clearly something is wrong with the parameter copying. In particular, I found that the output of the efficient version seems to be independent of the value of the final parameter of the DenseBlock layer. According to https://github.com/Tongcheng/caffe/blob/master/src/caffe/layers/DenseBlock_layer.cpp#L159, that parameter is related to the batch norm layer's moving-average factor. Is it because in the standard DenseNet all batch norm layers share the same last parameter, so this efficient implementation keeps only one? Beyond that, I still don't understand why its value doesn't affect the output, or what the correct mapping of parameters from the standard model to the efficient model would be.
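For context on why that last parameter may not change the output: a standard Caffe BatchNorm layer stores three blobs — running mean scaled by an accumulation factor s, running variance scaled by s, and s itself — and at test time it divides the stored statistics by s before normalizing. So as long as the consumer divides s back out, its exact value cancels. A small sketch of the cancellation (the inference formula here mimics Caffe BatchNorm; the numbers are toy values):

```python
# True statistics of some channel
true_mean, true_var = 0.25, 1.5

def normalize(x, stored_mean, stored_var, s, eps=1e-5):
    """Mimic Caffe BatchNorm inference: stored stats are divided by
    the scale factor s before being applied."""
    mean = stored_mean / s
    var = stored_var / s
    return (x - mean) / (var + eps) ** 0.5

x = 1.0
# Two different scale factors give (numerically) identical outputs,
# because s cancels out of mean and variance.
y1 = normalize(x, true_mean * 0.999, true_var * 0.999, 0.999)
y2 = normalize(x, true_mean * 0.5,   true_var * 0.5,   0.5)
assert abs(y1 - y2) < 1e-9
```

If the efficient DenseBlock layer performs the same division internally, that would explain why the observed output is insensitive to the stored factor's value; whether it actually does is what the linked source line would need to confirm.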