liuzhuang13 / DenseNet

Densely Connected Convolutional Networks, In CVPR 2017 (Best Paper Award).
BSD 3-Clause "New" or "Revised" License

Will you release the pre-trained models for Caffe? #10

Open · soeaver opened this issue 7 years ago

soeaver commented 7 years ago

Will you release the pre-trained models for Caffe?

liuzhuang13 commented 7 years ago

Thanks for your interest! We'll work on it as soon as possible. Hopefully the model will come out this year.

kaishijeng commented 7 years ago

Is the ImageNet Caffe model available yet?

Thanks,

liuzhuang13 commented 7 years ago

Sorry for the late response. Our pretrained DenseNet-121 Caffe model and prototxt have just been released here: https://github.com/liuzhuang13/DenseNet#imagenet-and-pretrained-models

Thanks for your interest!
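
For anyone who wants to try the released model, here is a minimal pycaffe loading sketch. The file names and the image path are placeholders, and the mean/scale preprocessing is intentionally omitted; the exact names and preprocessing values should be taken from the link above.

import caffe

# Placeholder file names; use the actual deploy prototxt and caffemodel from the release above.
net = caffe.Net('densenet121_deploy.prototxt', 'densenet121.caffemodel', caffe.TEST)

# Basic pycaffe preprocessing: HxWxC RGB in [0, 1] -> CxHxW BGR.
# Mean subtraction and input scaling are omitted; use the values documented with the model.
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))
transformer.set_channel_swap('data', (2, 1, 0))

img = caffe.io.load_image('example.jpg')  # placeholder image path
net.blobs['data'].data[...] = transformer.preprocess('data', img)
net.forward()
scores = net.blobs[net.outputs[0]].data  # class scores from the network's output blob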

shicai commented 7 years ago

@liuzhuang13 I just tested the pretrained DenseNet-121 Caffe model from Zhiqiang Shen, and the top-1/top-5 accuracy with a single center crop is only 70.8%/90.3% (256xN) or 72.5%/91.3% (256x256). Is something wrong? It should be 75.0%/92.3%, right?
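
For context, "256xN" presumably means resizing the shorter side to 256 while keeping the aspect ratio, and "256x256" means warping the image to a square; in both cases a 224x224 center crop is then evaluated. A minimal sketch of the two protocols (PIL, illustrative names only):

from PIL import Image

def center_crop(img, size=224):
    # Take a centered size x size crop from a PIL image.
    w, h = img.size
    left, top = (w - size) // 2, (h - size) // 2
    return img.crop((left, top, left + size, top + size))

def preprocess_256xN(img, short_side=256):
    # Resize the shorter side to 256, keeping the aspect ratio, then center-crop 224.
    w, h = img.size
    scale = short_side / min(w, h)
    img = img.resize((round(w * scale), round(h * scale)), Image.BILINEAR)
    return center_crop(img)

def preprocess_256x256(img):
    # Warp the image to 256x256, ignoring the aspect ratio, then center-crop 224.
    return center_crop(img.resize((256, 256), Image.BILINEAR))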

liuzhuang13 commented 7 years ago

@shicai Thanks for using our models! Yes, we are aware of this issue. Your 72.5%/91.3% accuracy (256x256) is the same as what we get. The difference may come from a different data augmentation scheme and other implementation differences between fb.resnet.torch and Caffe, even though we have tried to keep the other parameters as consistent as possible. On CIFAR the training curves also differed significantly between Torch and Caffe, and the Torch ResNet models provided by Facebook were slightly more accurate than the original Caffe ResNet models.

We hope this difference in ImageNet accuracy won't matter much when fine-tuning on other tasks. We'll update the models if we obtain more accurate results.
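
As an illustration of the kind of training-time augmentation difference mentioned above, here is a rough sketch using torchvision transforms. This is not the exact pipeline of either fb.resnet.torch or the Caffe setup (fb.resnet.torch additionally uses color and lighting jitter); it only contrasts scale/aspect-ratio augmentation with a fixed resize plus random crop.

import torchvision.transforms as T

# Scale + aspect-ratio augmentation in the spirit of fb.resnet.torch.
torch_style_train = T.Compose([
    T.RandomResizedCrop(224),
    T.RandomHorizontalFlip(),
    T.ToTensor(),
])

# Simpler Caffe-style augmentation: fixed resize, then a random 224 crop and mirroring.
caffe_style_train = T.Compose([
    T.Resize((256, 256)),
    T.RandomCrop(224),
    T.RandomHorizontalFlip(),
    T.ToTensor(),
])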

shicai commented 7 years ago

@liuzhuang13 Thank you for your quick reply. I just manually converted the Torch model into Caffe format. For some unknown reason, the accuracy is about 0.2~0.3% lower than the original Torch model. I will check it tomorrow.

liuzhuang13 commented 7 years ago

@shicai Thanks! Could you share your converted Caffe models, if convenient?

Tongcheng commented 7 years ago

@liuzhuang13 @shicai I think there might be a difference in the moving-average (EMA) procedure of batch normalization between Torch and Caffe. The default cuDNN Torch BatchNorm uses a momentum of 0.1 for the EMA inside the cuDNN API, which means new_global[Stat] = 0.1 * batch[Stat] + 0.9 * old_global[Stat], where Stat is the mean or variance. Caffe does this differently: during training, its global[Stat] accumulates batch_Stat[0] + moving_average_fraction * batch_Stat[-1] + moving_average_fraction^2 * batch_Stat[-2] + ..., and at inference it divides global[Stat] by 1 + moving_average_fraction + moving_average_fraction^2 + ... to obtain the inference mean/variance. The default moving_average_fraction in Caffe is 0.999.
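
To make the two update rules concrete, here is a minimal numerical sketch of the schemes described above (plain Python, illustrative names only):

# Torch/cuDNN-style exponential moving average with momentum = 0.1:
#   new_global = 0.1 * batch_stat + 0.9 * old_global
def torch_bn_update(global_stat, batch_stat, momentum=0.1):
    return momentum * batch_stat + (1.0 - momentum) * global_stat

# Caffe-style accumulation with moving_average_fraction (default 0.999): training keeps
# a decayed sum of batch statistics plus a matching normalization factor.
def caffe_bn_update(stat_sum, norm_factor, batch_stat, maf=0.999):
    stat_sum = maf * stat_sum + batch_stat    # batch[0] + maf*batch[-1] + maf^2*batch[-2] + ...
    norm_factor = maf * norm_factor + 1.0     # 1 + maf + maf^2 + ...
    return stat_sum, norm_factor

def caffe_bn_inference_stat(stat_sum, norm_factor):
    # At inference, Caffe divides the accumulated sum by the accumulated factor.
    return stat_sum / norm_factor

With moving_average_fraction close to 1, the Caffe scheme approximates a long-window average over many batches, while the Torch rule with momentum 0.1 weights recent batches much more heavily, so the two can end up with noticeably different inference statistics for the same training run.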

shicai commented 7 years ago

@Tongcheng Thanks for sharing. @liuzhuang13 I made my repo public, see https://github.com/shicai/DenseNet-Caffe. Everyone can download these caffemodels freely. Hope it helps.

liuzhuang13 commented 7 years ago

@shicai Thanks for sharing; we have added the link to our README page. Hope more people start using them :)

asfix commented 6 years ago

Your repository contains no train/val prototxts. Also, please check the end of your files: they all end with a convolutional layer. Is this right?

FaezeMM commented 5 years ago

I'm trying to fine-tune the DenseNet models (available here) on my dataset using the NVIDIA DIGITS system. I've already read all the issues and made some modifications to my custom network, but it gives me the following error:

conv2_1/x2/bn needs backward computation.
conv2_1/x1 needs backward computation.
relu2_1/x1 needs backward computation.
conv2_1/x1/scale needs backward computation.
conv2_1/x1/bn needs backward computation.
pool1_pool1_0_split needs backward computation.
pool1 needs backward computation.
relu1 needs backward computation.
conv1/scale needs backward computation.
conv1/bn needs backward computation.
conv1 needs backward computation.
label_val-data_1_split does not need backward computation.
val-data does not need backward computation.
This network produces output accuracy
This network produces output loss
Network initialization done.
Solver scaffolding done.
Finetuning from /home/ubuntu/models/DenseNet-Caffe/densenet201.caffemodel
Ignoring source layer input
Check failed: target_blobs.size() == source_layer.blobs_size() (5 vs. 3) Incompatible number of blobs for layer conv1/bn

Here is my network. I used the original prototxt and made some modifications, as below:

layer {
 name: "train-data"
 type: "Data"
 top: "data"
 top: "label"
 include {
   stage: "train"
 }
 transform_param {
   crop_size: 224
 }
 data_param {
   batch_size: 126
 }
}
layer {
 name: "val-data"
 type: "Data"
 top: "data"
 top: "label"
 include {
   stage: "val"
 }
 transform_param {
   crop_size: 224
 }
 data_param {
   batch_size: 64
 }
}
layer {
 name: "conv1"
 type: "Convolution"
 bottom: "data"
 top: "conv1"
 convolution_param {
   num_output: 64
   bias_term: false
   pad: 3
   kernel_size: 7
   stride: 2
 }
}
layer {
 name: "conv1/bn"
 type: "BatchNorm"
 bottom: "conv1"
 top: "conv1/bn"
 batch_norm_param {
   eps: 1e-5
 }
}
layer {
 name: "conv1/scale"
 type: "Scale"
 bottom: "conv1/bn"
 top: "conv1/bn"
 scale_param {
   bias_term: true
 }
}
layer {
 name: "relu1"
 type: "ReLU"
 bottom: "conv1/bn"
 top: "conv1/bn"
}
.
.
.
.
layer {
 name: "fc6new"
 type: "Convolution"
 bottom: "pool5"
 top: "fc6new"
 convolution_param {
   num_output: 35
   kernel_size: 1
 }
}

layer {
 name: "loss"
 type: "SoftmaxWithLoss"
 bottom: "fc6new"
 bottom: "label"
 top: "loss"
 exclude {
   stage: "deploy"
 }
}
layer {
 name: "accuracy"
 type: "Accuracy"
 bottom: "fc6new"
 bottom: "label"
 top: "accuracy"
 include {
   stage: "val"
 }
}
layer {
 name: "accuracy_train"
 type: "Accuracy"
 bottom: "fc6new"
 bottom: "label"
 top: "accuracy_train"
 include {
   stage: "train"
 }
 accuracy_param {
   top_k: 5
 }
}
layer {
 name: "softmax"
 type: "Softmax"
 bottom: "fc6new"
 top: "softmax"
 include {
   stage: "deploy"
 }
}