nattari opened this issue 7 years ago
You wrote the wrong layer type. It should be:
layer {
name: "loss"
type: "OrdinalRegressionLoss"
ordinal_regression_loss_param {
k: 4
}
bottom: "fc8bisi"
bottom: "label"
top: "loss"
}
Ah, thanks for the correction. I was able to compile Caffe after adding the layer files provided here, and the tests ran successfully. Using the code, however, is not quite straightforward: as I read in the paper, there are pre-processing steps required. We need to prepare the input as D(K) = {x(i), y(i,k), w(i,k)}, where k = 1, ..., K-1. Hence, if I have 3 classes (0, 1, 2), I would have 2 binary classifiers. If the class label of the i-th image is, say, '1', my input would be D = {x(i), (0,1), (1,1)}. Is that correct?
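A minimal Python sketch of this sub-task encoding, assuming the common OR-CNN convention that task t asks "is the label greater than t?"; the function name and the exact convention are illustrative, not taken from this repo:

def encode_ordinal(label, num_classes):
    """Return the K-1 binary sub-task targets for one ordinal label."""
    return [1 if label > t else 0 for t in range(num_classes - 1)]

# 3 classes (0, 1, 2) -> 2 binary tasks; label 1 gives targets [1, 0]:
# task "label > 0?" is positive, task "label > 1?" is negative.
print(encode_ordinal(1, 3))  # [1, 0]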
Also, λ(t) is the importance of the t-th task; how do we define this importance? Is it the optional 'weight_file' parameter? And is the optional parameter 'k' equal to K-1?
I am really looking forward to using this code for my application. It would be great if you could share sample code where every step, from data preparation to training and testing, can be tried directly.
Waiting for your reply.
There are two kinds of weight: the inter weight represents the importance of each task, and the outer weight balances the training samples of each task. In this layer implementation, the inter weights are all 1 (but you can easily add this weight), and the outer weight is read from a text file.
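A hedged sketch of how such a weight file might be generated from the training labels, assuming it simply lists one balancing weight per binary task; the file names and the exact format the layer expects are assumptions here:

import numpy as np

# Hypothetical input: one ordinal label per training sample.
labels = np.loadtxt('train_labels.txt', dtype=int)
num_classes = int(labels.max()) + 1
k = num_classes - 1  # number of binary sub-tasks

weights = []
for t in range(k):
    pos = (labels > t).mean()  # fraction of positive samples for task t
    # Up-weight whichever side of task t is rarer, so each task is balanced.
    weights.append(1.0 / max(min(pos, 1.0 - pos), 1e-6))
weights = np.array(weights)
weights *= k / weights.sum()  # rescale so the weights have mean 1

np.savetxt('outer_weights.txt', weights, fmt='%.6f')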
Okay. That means inter_weight is λ, which is 1 in all cases, and outer_weight is for class balancing.
And, as I mentioned earlier, how should my train and test files look? In a typical classification task each line is (filename, label). How would it be in this case?
Something like below. Note that this layer won't give the final output label (which is always an integer label for each sample) and can only be used during training. You have to implement the step that recovers the output label from the "output" blob yourself; see the sketch after the prototxt below.
layer{
name: "output"
type: "InnerProduct"
bottom: "feature"
top: "output"
inner_product_param {
num_output: 200
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "loss"
type: "OrdinalRegressionLoss"
bottom: "output"
bottom: "label"
top: "loss"
ordinal_regression_loss_param {
k: 100 # k should be the last layer's num_output / 2 (the number of binary sub-tasks, i.e. num_classes - 1)
}
}
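For reference, a minimal numpy sketch of that decoding step, assuming the output blob lays out k (negative, positive) logit pairs per sample, which is what the "num_output / 2" comment above implies; the function and array names are illustrative:

import numpy as np

def decode_labels(output_blob, k):
    """output_blob: (N, 2*k) array read from the "output" blob."""
    logits = output_blob.reshape(-1, k, 2)
    # Softmax over each task's (negative, positive) logit pair.
    e = np.exp(logits - logits.max(axis=2, keepdims=True))
    p_positive = e[:, :, 1] / e.sum(axis=2)
    # The predicted label is the number of tasks voting "label > t".
    return (p_positive > 0.5).sum(axis=1)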
Hi, I used this code for age estimation, but I get an error: can't find layer type "MAE". My train_test.prototxt has the layer:
layer {
bottom: "fc_output"
bottom: "label"
top: "mae"
name: "mae"
type: "MAE"
}
Can you tell me how to use this layer? Thank you very much; I hope you reply.
This layer is just for testing the accuracy; you can remove it and still train the network.
The training details can be found in kongsicong/Age_recognition_OR.
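If the custom "MAE" layer is not available in your Caffe build, the same number can be computed offline from decoded predictions; this is plain mean absolute error, not that layer's actual implementation:

import numpy as np

def mean_absolute_error(predicted_labels, true_labels):
    """MAE between integer label vectors, e.g. predicted vs. ground-truth ages."""
    return np.abs(np.asarray(predicted_labels) - np.asarray(true_labels)).mean()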
You said that the outer weight comes from a text file. Can you provide the weight file? My model does not converge during training.
Train prototxt:
name: "ResNet-18"
layer {
name: "data"
type: "ImageData"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mirror: true
scale: 0.0078125
mean_value: 127.5
mean_value: 127.5
mean_value: 127.5
contrast_brightness_adjustment: true
min_contrast: 0.8
max_contrast: 1.2
max_color_shift: 20
smooth_filtering: true
max_smooth: 6
apply_probability: 0.5
}
image_data_param {
source: "/home/yusheng.zeng/data/train_age_gender/age/label/megaage_list_train.txt"
batch_size: 96
new_height: 64
new_width: 64
shuffle: true
root_folder: "/home/yusheng.zeng/data/train_age_gender/age/"
}
}
layer {
name: "data"
type: "ImageData"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
mirror: false
scale: 0.0078125
mean_value: 127.5
mean_value: 127.5
mean_value: 127.5
}
image_data_param {
source: "/home/yusheng.zeng/data/train_age_gender/age/label/megaage_list_test.txt"
batch_size: 32
new_height: 64
new_width: 64
shuffle: true
root_folder: "/home/yusheng.zeng/data/train_age_gender/age/"
}
}
layer {
bottom: "data"
top: "conv1_new"
name: "conv1_new"
type: "Convolution"
convolution_param {
num_output: 64
kernel_size: 7
pad: 3
#stride: 2
stride: 1
weight_filler {
type: "msra"
}
bias_term: false
}
}
layer {
bottom: "conv1_new"
top: "conv1"
name: "bn_conv1"
type: "BatchNorm"
batch_norm_param {
moving_average_fraction: 0.9
}
}
layer {
bottom: "conv1"
top: "conv1"
name: "scale_conv1"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "conv1"
top: "conv1"
name: "conv1_relu"
type: "ReLU"
}
layer {
bottom: "conv1"
top: "res2a_branch1_new"
name: "res2a_branch1_new"
type: "Convolution"
convolution_param {
num_output: 64
kernel_size: 1
pad: 0
stride: 1
weight_filler {
type: "msra"
}
bias_term: false
}
}
layer {
bottom: "res2a_branch1_new"
top: "res2a_branch1"
name: "bn2a_branch1"
type: "BatchNorm"
batch_norm_param {
moving_average_fraction: 0.9
}
}
layer {
bottom: "res2a_branch1"
top: "res2a_branch1"
name: "scale2a_branch1"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "conv1"
top: "res2a_branch2a_new"
name: "res2a_branch2a_new"
type: "Convolution"
convolution_param {
num_output: 64
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "msra"
}
bias_term: false
}
}
layer {
bottom: "res2a_branch2a_new"
top: "res2a_branch2a"
name: "bn2a_branch2a"
type: "BatchNorm"
batch_norm_param {
moving_average_fraction: 0.9
}
}
layer {
bottom: "res2a_branch2a"
top: "res2a_branch2a"
name: "scale2a_branch2a"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res2a_branch2a"
top: "res2a_branch2a"
name: "res2a_branch2a_relu"
type: "ReLU"
}
layer {
bottom: "res2a_branch2a"
top: "res2a_branch2b"
name: "res2a_branch2b"
type: "Convolution"
convolution_param {
num_output: 64
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "msra"
}
bias_term: false
}
}
layer {
bottom: "res2a_branch2b"
top: "res2a_branch2b"
name: "bn2a_branch2b"
type: "BatchNorm"
batch_norm_param {
moving_average_fraction: 0.9
}
}
layer {
bottom: "res2a_branch2b"
top: "res2a_branch2b"
name: "scale2a_branch2b"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res2a_branch1"
bottom: "res2a_branch2b"
top: "res2a"
name: "res2a"
type: "Eltwise"
eltwise_param {
operation: SUM
}
}
layer {
bottom: "res2a"
top: "res2a"
name: "res2a_relu"
type: "ReLU"
}
layer {
bottom: "res2a"
top: "res2b_branch2a"
name: "res2b_branch2a"
type: "Convolution"
convolution_param {
num_output: 64
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "msra"
}
bias_term: false
}
}
layer {
bottom: "res2b_branch2a"
top: "res2b_branch2a"
name: "bn2b_branch2a"
type: "BatchNorm"
batch_norm_param {
moving_average_fraction: 0.9
}
}
layer {
bottom: "res2b_branch2a"
top: "res2b_branch2a"
name: "scale2b_branch2a"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res2b_branch2a"
top: "res2b_branch2a"
name: "res2b_branch2a_relu"
type: "ReLU"
}
layer {
bottom: "res2b_branch2a"
top: "res2b_branch2b"
name: "res2b_branch2b"
type: "Convolution"
convolution_param {
num_output: 64
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "msra"
}
bias_term: false
}
}
layer {
bottom: "res2b_branch2b"
top: "res2b_branch2b"
name: "bn2b_branch2b"
type: "BatchNorm"
batch_norm_param {
moving_average_fraction: 0.9
}
}
layer {
bottom: "res2b_branch2b"
top: "res2b_branch2b"
name: "scale2b_branch2b"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res2a"
bottom: "res2b_branch2b"
top: "res2b"
name: "res2b"
type: "Eltwise"
eltwise_param {
operation: SUM
}
}
layer {
bottom: "res2b"
top: "res2b"
name: "res2b_relu"
type: "ReLU"
}
layer {
bottom: "res2b"
top: "res3a_branch1"
name: "res3a_branch1"
type: "Convolution"
convolution_param {
num_output: 128
kernel_size: 1
pad: 0
stride: 2
weight_filler {
type: "msra"
}
bias_term: false
}
}
layer {
bottom: "res3a_branch1"
top: "res3a_branch1"
name: "bn3a_branch1"
type: "BatchNorm"
batch_norm_param {
moving_average_fraction: 0.9
}
}
layer {
bottom: "res3a_branch1"
top: "res3a_branch1"
name: "scale3a_branch1"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res2b"
top: "res3a_branch2a"
name: "res3a_branch2a"
type: "Convolution"
convolution_param {
num_output: 128
kernel_size: 3
pad: 1
stride: 2
weight_filler {
type: "msra"
}
bias_term: false
}
}
layer {
bottom: "res3a_branch2a"
top: "res3a_branch2a"
name: "bn3a_branch2a"
type: "BatchNorm"
batch_norm_param {
moving_average_fraction: 0.9
}
}
layer {
bottom: "res3a_branch2a"
top: "res3a_branch2a"
name: "scale3a_branch2a"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res3a_branch2a"
top: "res3a_branch2a"
name: "res3a_branch2a_relu"
type: "ReLU"
}
layer {
bottom: "res3a_branch2a"
top: "res3a_branch2b"
name: "res3a_branch2b"
type: "Convolution"
convolution_param {
num_output: 128
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "msra"
}
bias_term: false
}
}
layer {
bottom: "res3a_branch2b"
top: "res3a_branch2b"
name: "bn3a_branch2b"
type: "BatchNorm"
batch_norm_param {
moving_average_fraction: 0.9
}
}
layer {
bottom: "res3a_branch2b"
top: "res3a_branch2b"
name: "scale3a_branch2b"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res3a_branch1"
bottom: "res3a_branch2b"
top: "res3a"
name: "res3a"
type: "Eltwise"
eltwise_param {
operation: SUM
}
}
layer {
bottom: "res3a"
top: "res3a"
name: "res3a_relu"
type: "ReLU"
}
layer {
bottom: "res3a"
top: "res3b_branch2a"
name: "res3b_branch2a"
type: "Convolution"
convolution_param {
num_output: 128
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "msra"
}
bias_term: false
}
}
layer {
bottom: "res3b_branch2a"
top: "res3b_branch2a"
name: "bn3b_branch2a"
type: "BatchNorm"
batch_norm_param {
moving_average_fraction: 0.9
}
}
layer {
bottom: "res3b_branch2a"
top: "res3b_branch2a"
name: "scale3b_branch2a"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res3b_branch2a"
top: "res3b_branch2a"
name: "res3b_branch2a_relu"
type: "ReLU"
}
layer {
bottom: "res3b_branch2a"
top: "res3b_branch2b"
name: "res3b_branch2b"
type: "Convolution"
convolution_param {
num_output: 128
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "msra"
}
bias_term: false
}
}
layer {
bottom: "res3b_branch2b"
top: "res3b_branch2b"
name: "bn3b_branch2b"
type: "BatchNorm"
batch_norm_param {
moving_average_fraction: 0.9
}
}
layer {
bottom: "res3b_branch2b"
top: "res3b_branch2b"
name: "scale3b_branch2b"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res3a"
bottom: "res3b_branch2b"
top: "res3b"
name: "res3b"
type: "Eltwise"
eltwise_param {
operation: SUM
}
}
layer {
bottom: "res3b"
top: "res3b"
name: "res3b_relu"
type: "ReLU"
}
layer {
bottom: "res3b"
top: "res4a_branch1"
name: "res4a_branch1"
type: "Convolution"
convolution_param {
num_output: 256
kernel_size: 1
pad: 0
stride: 2
weight_filler {
type: "msra"
}
bias_term: false
}
}
layer {
bottom: "res4a_branch1"
top: "res4a_branch1"
name: "bn4a_branch1"
type: "BatchNorm"
batch_norm_param {
moving_average_fraction: 0.9
}
}
layer {
bottom: "res4a_branch1"
top: "res4a_branch1"
name: "scale4a_branch1"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res3b"
top: "res4a_branch2a"
name: "res4a_branch2a"
type: "Convolution"
convolution_param {
num_output: 256
kernel_size: 3
pad: 1
stride: 2
weight_filler {
type: "msra"
}
bias_term: false
}
}
layer {
bottom: "res4a_branch2a"
top: "res4a_branch2a"
name: "bn4a_branch2a"
type: "BatchNorm"
batch_norm_param {
moving_average_fraction: 0.9
}
}
layer {
bottom: "res4a_branch2a"
top: "res4a_branch2a"
name: "scale4a_branch2a"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res4a_branch2a"
top: "res4a_branch2a"
name: "res4a_branch2a_relu"
type: "ReLU"
}
layer {
bottom: "res4a_branch2a"
top: "res4a_branch2b"
name: "res4a_branch2b"
type: "Convolution"
convolution_param {
num_output: 256
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "msra"
}
bias_term: false
}
}
layer {
bottom: "res4a_branch2b"
top: "res4a_branch2b"
name: "bn4a_branch2b"
type: "BatchNorm"
batch_norm_param {
moving_average_fraction: 0.9
}
}
layer {
bottom: "res4a_branch2b"
top: "res4a_branch2b"
name: "scale4a_branch2b"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res4a_branch1"
bottom: "res4a_branch2b"
top: "res4a"
name: "res4a"
type: "Eltwise"
eltwise_param {
operation: SUM
}
}
layer {
bottom: "res4a"
top: "res4a"
name: "res4a_relu"
type: "ReLU"
}
layer {
bottom: "res4a"
top: "res4b_branch2a"
name: "res4b_branch2a"
type: "Convolution"
convolution_param {
num_output: 256
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "msra"
}
bias_term: false
}
}
layer {
bottom: "res4b_branch2a"
top: "res4b_branch2a"
name: "bn4b_branch2a"
type: "BatchNorm"
batch_norm_param {
moving_average_fraction: 0.9
}
}
layer {
bottom: "res4b_branch2a"
top: "res4b_branch2a"
name: "scale4b_branch2a"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res4b_branch2a"
top: "res4b_branch2a"
name: "res4b_branch2a_relu"
type: "ReLU"
}
layer {
bottom: "res4b_branch2a"
top: "res4b_branch2b"
name: "res4b_branch2b"
type: "Convolution"
convolution_param {
num_output: 256
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "msra"
}
bias_term: false
}
}
layer {
bottom: "res4b_branch2b"
top: "res4b_branch2b"
name: "bn4b_branch2b"
type: "BatchNorm"
batch_norm_param {
moving_average_fraction: 0.9
}
}
layer {
bottom: "res4b_branch2b"
top: "res4b_branch2b"
name: "scale4b_branch2b"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res4a"
bottom: "res4b_branch2b"
top: "res4b"
name: "res4b"
type: "Eltwise"
eltwise_param {
operation: SUM
}
}
layer {
bottom: "res4b"
top: "res4b"
name: "res4b_relu"
type: "ReLU"
}
layer {
bottom: "res4b"
top: "res5a_branch1"
name: "res5a_branch1"
type: "Convolution"
convolution_param {
num_output: 512
kernel_size: 1
pad: 0
stride: 2
weight_filler {
type: "msra"
}
bias_term: false
}
}
layer {
bottom: "res5a_branch1"
top: "res5a_branch1"
name: "bn5a_branch1"
type: "BatchNorm"
batch_norm_param {
moving_average_fraction: 0.9
}
}
layer {
bottom: "res5a_branch1"
top: "res5a_branch1"
name: "scale5a_branch1"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res4b"
top: "res5a_branch2a"
name: "res5a_branch2a"
type: "Convolution"
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
stride: 2
weight_filler {
type: "msra"
}
bias_term: false
}
}
layer {
bottom: "res5a_branch2a"
top: "res5a_branch2a"
name: "bn5a_branch2a"
type: "BatchNorm"
batch_norm_param {
moving_average_fraction: 0.9
}
}
layer {
bottom: "res5a_branch2a"
top: "res5a_branch2a"
name: "scale5a_branch2a"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res5a_branch2a"
top: "res5a_branch2a"
name: "res5a_branch2a_relu"
type: "ReLU"
}
layer {
bottom: "res5a_branch2a"
top: "res5a_branch2b"
name: "res5a_branch2b"
type: "Convolution"
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "msra"
}
bias_term: false
}
}
layer {
bottom: "res5a_branch2b"
top: "res5a_branch2b"
name: "bn5a_branch2b"
type: "BatchNorm"
batch_norm_param {
moving_average_fraction: 0.9
}
}
layer {
bottom: "res5a_branch2b"
top: "res5a_branch2b"
name: "scale5a_branch2b"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res5a_branch1"
bottom: "res5a_branch2b"
top: "res5a"
name: "res5a"
type: "Eltwise"
eltwise_param {
operation: SUM
}
}
layer {
bottom: "res5a"
top: "res5a"
name: "res5a_relu"
type: "ReLU"
}
layer {
bottom: "res5a"
top: "res5b_branch2a"
name: "res5b_branch2a"
type: "Convolution"
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "msra"
}
bias_term: false
}
}
layer {
bottom: "res5b_branch2a"
top: "res5b_branch2a"
name: "bn5b_branch2a"
type: "BatchNorm"
batch_norm_param {
moving_average_fraction: 0.9
}
}
layer {
bottom: "res5b_branch2a"
top: "res5b_branch2a"
name: "scale5b_branch2a"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res5b_branch2a"
top: "res5b_branch2a"
name: "res5b_branch2a_relu"
type: "ReLU"
}
layer {
bottom: "res5b_branch2a"
top: "res5b_branch2b"
name: "res5b_branch2b"
type: "Convolution"
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "msra"
}
bias_term: false
}
}
layer {
bottom: "res5b_branch2b"
top: "res5b_branch2b"
name: "bn5b_branch2b"
type: "BatchNorm"
batch_norm_param {
moving_average_fraction: 0.9
}
}
layer {
bottom: "res5b_branch2b"
top: "res5b_branch2b"
name: "scale5b_branch2b"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res5a"
bottom: "res5b_branch2b"
top: "res5b"
name: "res5b"
type: "Eltwise"
eltwise_param {
operation: SUM
}
}
layer {
bottom: "res5b"
top: "res5b"
name: "res5b_relu"
type: "ReLU"
}
layer {
bottom: "res5b"
top: "pool5"
name: "pool5"
type: "Pooling"
pooling_param {
kernel_size: 7
stride: 1
pool: AVE
}
}
layer {
bottom: "pool5"
top: "fc70"
name: "fc70"
type: "InnerProduct"
param {
lr_mult: 10
decay_mult: 1
}
param {
lr_mult: 20
decay_mult: 1
}
inner_product_param {
num_output: 80 # ages 0-69
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
bottom: "fc70"
top: "fc_output"
name: "fc_output"
type: "InnerProduct"
param {
lr_mult: 10
decay_mult: 1
}
param {
lr_mult: 20
decay_mult: 1
}
inner_product_param {
num_output: 138 # ages 0-69
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
bottom: "fc_output"
bottom: "label"
top: "mae"
name: "mae"
type: "MAE"
}
layer {
bottom: "fc_output"
bottom: "label"
top: "loss"
name: "loss"
type: "OrdinalRegressionLoss"
include {
phase: TRAIN
}
ordinal_regression_loss_param {
k:69
}
}
layer {
bottom: "fc_output"
bottom: "label"
top: "loss"
name: "loss"
type: "OrdinalRegressionLoss"
include {
phase: TEST
}
ordinal_regression_loss_param {
k:69
}
}
Solver prototxt:
net: "train_res18_age_OR_refine.prototxt"
test_iter: 130
test_interval: 4000
base_lr: 0.001
display: 50
lr_policy: "poly"
max_iter: 200000
power: 1
momentum: 0.9
weight_decay: 0.0005
snapshot: 4000
snapshot_prefix: "/home/yusheng.zeng/work/dl_base/face_GA_train/model_age/G-res18-OR-refine"
test_initialization: false
Hi,
I have a task where my class types are ordinal, and I am using the code provided here, but the performance worsens after adding this param. I wanted to check whether my understanding of the params is correct, i.e. k = number of sub-tasks (classes) and weight_file = a text file with weights for the k classes. Is that right?
I am using it like this:
layer {
name: "loss"
type: "SoftmaxWithLoss"
ordinal_regression_loss_param {
k: 4
}
bottom: "fc8bisi"
bottom: "label"
top: "loss"
}
Any help would be highly appreciated.
Thanks in advance.