soeaver opened 7 years ago
Thanks for your interest! We'll work on it as soon as possible. Hopefully the model will come out this year.
Is the ImageNet Caffe model available yet?
Thanks,
Sorry for the late response. Our pretrained DenseNet-121 model and prototxt in Caffe have just been released here: https://github.com/liuzhuang13/DenseNet#imagenet-and-pretrained-models
Thanks for your interest!
@liuzhuang13 I just tested the pretrained DenseNet-121 Caffe model from Zhiqiang Shen, and the top-1/top-5 accuracy using a single center crop is only 70.8%/90.3% (256xN) or 72.5%/91.3% (256x256). Is there something wrong? It should be 75.0%/92.3%, right?
@shicai Thanks for using our models! Yes, we are aware of this issue. Your 72.5%/91.3% accuracy (256x256) is the same as what we get. The difference may come from a different data augmentation scheme and other implementation differences between fb.resnet.torch and Caffe, although we have tried to keep the other parameters as consistent as possible. On CIFAR the training curves were also significantly different between Torch and Caffe. The Torch ResNet models provided by Facebook were also slightly more accurate than the original Caffe ResNet models.
We hope this difference in ImageNet accuracy won't matter much when fine-tuning on other tasks. We'll update the models if we get more accurate results.
@liuzhuang13 Thank you for your quick reply. I just manually converted the Torch model into Caffe format. Due to some unknown reasons, the accuracy is about 0.2~0.3% lower than the original Torch model. I will check it tomorrow.
@shicai Thanks! Could you share your converted caffe models if convenient?
@liuzhuang13 @shicai I think there might be differences in the EMA procedure of BN between torch and BN. The default cudnn-torch BN have momentum parameter = 0.1 for EMA of BN, inside cudnn api, this means new_global[Stat] = 0.1batch[Stat]+0.9old_global[Stat], where Stat can be Mean/Var. But caffe do this differently, caffe's global[stat] holds value as batch_Stat[0]+moving_average_fraction batch_Stat[-1]+moving_average_fraction^2 batch_Stat[-2]+... in the training of BN, and in the inference, it divide the global_Stat by 1+moving_average_fraction+moving_average_fraction^2+... to get inference Mean/Var. And default moving_average_fraction in caffe is 0.999.
@Tongcheng Thanks for sharing. @liuzhuang13 I made my repo public, see https://github.com/shicai/DenseNet-Caffe. Everyone can download these caffemodels freely. Hope it helps.
@shicai Thanks for sharing, we have added the link to our readme page. Hope more people start using them :)
Your repository contains no train/val prototxts. Also, please check the end of your files: they all end with a convolutional layer. Is this right?
I'm trying to fine-tune on my dataset using the DenseNet models (available here) and the NVIDIA DIGITS system. I've already read all the issues and made some modifications to my custom network, but it gives me the following error:
conv2_1/x2/bn needs backward computation.
conv2_1/x1 needs backward computation.
relu2_1/x1 needs backward computation.
conv2_1/x1/scale needs backward computation.
conv2_1/x1/bn needs backward computation.
pool1_pool1_0_split needs backward computation.
pool1 needs backward computation.
relu1 needs backward computation.
conv1/scale needs backward computation.
conv1/bn needs backward computation.
conv1 needs backward computation.
label_val-data_1_split does not need backward computation.
val-data does not need backward computation.
This network produces output accuracy
This network produces output loss
Network initialization done.
Solver scaffolding done.
Finetuning from /home/ubuntu/models/DenseNet-Caffe/densenet201.caffemodel
Ignoring source layer input
Check failed: target_blobs.size() == source_layer.blobs_size() (5 vs. 3) Incompatible number of blobs for layer conv1/bn
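For context, a likely cause (an assumption, not confirmed in this thread): DIGITS runs on NVCaffe, whose BatchNorm layer fuses the scale/shift parameters and stores 5 parameter blobs, while the released BVLC-style model stores 3 (mean, variance, scale factor). A hypothetical re-creation of the consistency check behind that message (copy_layer_blobs and its arguments are illustrative, not Caffe's real API):

```python
def copy_layer_blobs(target_blobs, source_blobs, layer_name):
    # Caffe's CopyTrainedLayersFrom refuses to copy weights when a layer
    # declared in the target prototxt has a different number of parameter
    # blobs than the same-named layer stored in the .caffemodel.
    if len(target_blobs) != len(source_blobs):
        raise RuntimeError(
            "Incompatible number of blobs for layer %s (%d vs. %d)"
            % (layer_name, len(target_blobs), len(source_blobs)))
    return list(source_blobs)

# Target net (NVCaffe-style BatchNorm): 5 blobs; source model: 3 blobs.
try:
    copy_layer_blobs([None] * 5, [None] * 3, "conv1/bn")
except RuntimeError as err:
    print(err)  # Incompatible number of blobs for layer conv1/bn (5 vs. 3)
```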
Here is my network; I used the original prototxt and made some modifications as below:
layer {
  name: "train-data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    stage: "train"
  }
  transform_param {
    crop_size: 224
  }
  data_param {
    batch_size: 126
  }
}
layer {
  name: "val-data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    stage: "val"
  }
  transform_param {
    crop_size: 224
  }
  data_param {
    batch_size: 64
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 64
    bias_term: false
    pad: 3
    kernel_size: 7
    stride: 2
  }
}
layer {
  name: "conv1/bn"
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1/bn"
  batch_norm_param {
    eps: 1e-5
  }
}
layer {
  name: "conv1/scale"
  type: "Scale"
  bottom: "conv1/bn"
  top: "conv1/bn"
  scale_param {
    bias_term: true
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1/bn"
  top: "conv1/bn"
}
...
layer {
  name: "fc6new"
  type: "Convolution"
  bottom: "pool5"
  top: "fc6new"
  convolution_param {
    num_output: 35
    kernel_size: 1
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc6new"
  bottom: "label"
  top: "loss"
  exclude {
    stage: "deploy"
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc6new"
  bottom: "label"
  top: "accuracy"
  include {
    stage: "val"
  }
}
layer {
  name: "accuracy_train"
  type: "Accuracy"
  bottom: "fc6new"
  bottom: "label"
  top: "accuracy_train"
  include {
    stage: "train"
  }
  accuracy_param {
    top_k: 5
  }
}
layer {
  name: "softmax"
  type: "Softmax"
  bottom: "fc6new"
  top: "softmax"
  include {
    stage: "deploy"
  }
}
Will you release the pretrained models in Caffe?