soeaver / caffe-model

Caffe models (including classification, detection and segmentation) and deploy files for famous networks
MIT License
1.28k stars, 624 forks

Finetuning error #19

Closed junedgar closed 7 years ago

junedgar commented 7 years ago

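Hello! Thank you for making these models public! When I fine-tune with the models from your Baidu Cloud share, I always hit the following problem, whether I use inceptionV4 or inceptionResnetV2:

Check failed: target_blobs.size() == source_layer.blobs_size() (2 vs. 1) Incompatible number of blobs for layer conv1_3x3_s2

Could you give me any advice on how to solve this?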

shicai commented 7 years ago

Pay attention to bias_term on the conv layers. When it is not set it defaults to true, but the model requires false. If you don't set bias_term: false, you get this error.
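The rule described above can be sketched in a few lines of Python. This is only an illustration of the blob-count comparison behind the error message, not Caffe's own code: the helper names `conv_blob_count` and `check_copy` are hypothetical, and the pretrained blob count of 1 is read off the "(2 vs. 1)" in the error.

```python
def conv_blob_count(bias_term=True):
    """Learnable blobs held by a Caffe Convolution layer.

    One blob for the weights, plus one for the bias when bias_term is true,
    which is the default when the field is omitted.
    """
    return 2 if bias_term else 1


def check_copy(layer_name, target_bias_term, source_blob_count):
    """Imitate the blob-count comparison made when pretrained weights are copied."""
    target = conv_blob_count(target_bias_term)
    if target != source_blob_count:
        return ("Check failed: target_blobs.size() == source_layer.blobs_size() "
                f"({target} vs. {source_blob_count}) "
                f"Incompatible number of blobs for layer {layer_name}")
    return "ok"


# bias_term omitted in the training prototxt; the pretrained layer stores weights only.
print(check_copy("conv1_3x3_s2", True, 1))
# With bias_term: false the counts agree.
print(check_copy("conv1_3x3_s2", False, 1))
```

The first call reproduces the message from the original report; the second prints "ok".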

junedgar commented 7 years ago

@shicai Thanks for the suggestion. I added bias_term: false to the conv layer, as shown below, but now I get a new error message:

convolution_param {
  num_output: 32
  pad: 0
  kernel_size: 3
  stride: 2
  bias_term: false
  weight_filler {
    type: "xavier"
    std: 0.01
  }
  bias_filler {
    type: "constant"
    value: 0.2
  }
}

I0626 14:45:46.157280 65202 net.cpp:157] Top shape: 64 32 149 149 (45467648)
I0626 14:45:46.157291 65202 net.cpp:165] Memory required for data: 250530816
F0626 14:45:46.157311 65202 net.cpp:169] Check failed: param_size <= num_param_blobs (2 vs. 1) Too many params specified for layer conv1_3x3_s2

I googled the error and found advice saying that bias_term should be removed. (link)

junedgar commented 7 years ago

@shicai When I followed the suggestion at that link and commented out the param blocks, as shown below, the original error came back:

layer {
  name: "conv1_3x3_s2"
  type: "Convolution"
  bottom: "data"
  top: "conv1_3x3_s2"
  #param {
  #  lr_mult: 1
  #  decay_mult: 1
  #}
  #param {
  #  lr_mult: 2
  #  decay_mult: 0
  #}
  convolution_param {
    num_output: 32
    pad: 0
    kernel_size: 3
    stride: 2
    bias_term: false
    weight_filler {
      type: "xavier"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0.2
    }
  }
}

Check failed: target_blobs.size() == source_layer.blobs_size() (2 vs. 1) Incompatible number of blobs for layer conv2_3x3_s1

soeaver commented 7 years ago

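@junedgar You set bias_term: false, but you still configured an initializer for the bias, bias_filler. Remove that part as well. Also, the model download on Baidu Cloud comes with a deploy.prototxt. Please base your changes on that config file, not on the one in the GitHub repo; otherwise you will very likely hit errors or fail to reproduce the accuracy.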

shicai commented 7 years ago

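This is the correct way to write it. You only have a weight and no bias, so only one param block is needed; and with no bias there is naturally no bias_filler either.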

layer {
  name: "conv1_3x3_s2"
  type: "Convolution"
  bottom: "data"
  top: "conv1_3x3_s2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  convolution_param {
    num_output: 32
    pad: 0
    kernel_size: 3
    stride: 2
    bias_term: false
    weight_filler {
      type: "xavier"
    }
  }
}

junedgar commented 7 years ago

@soeaver OK, thanks for the guidance.

junedgar commented 7 years ago

@shicai I made the change you suggested but still get the same error. Thanks for your answer; I'll keep looking into it on my side.

Attempting to upgrade batch norm layers using deprecated params: inceptionV4/inception_v4.caffemodel
I0626 15:52:19.389828 76449 upgrade_proto.cpp:80] Successfully upgraded batch norm layers using deprecated params.
I0626 15:52:19.389855 76449 net.cpp:761] Ignoring source layer input
F0626 15:52:19.389892 76449 net.cpp:767] Check failed: target_blobs.size() == source_layer.blobs_size() (2 vs. 1) Incompatible number of blobs for layer conv2_3x3_s1

shicai commented 7 years ago

Last time it was conv1_3x3_s2. This time it is the same error, but for conv2_3x3_s1. That is a different layer; didn't you notice?
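Since the same fix has to be repeated for each affected layer, a small scan of the training prototxt can list the candidates in one pass. The following is a minimal sketch using only the Python standard library; `split_layers` and `check_conv_layers` are hypothetical helpers, the regex-based parsing handles only simple files, and it assumes that every Convolution layer in the pretrained model was saved without a bias, which should be confirmed against the deploy.prototxt shipped with the model.

```python
import re


def split_layers(prototxt):
    """Yield the body text of each top-level `layer { ... }` block."""
    text = re.sub(r"#.*", "", prototxt)  # drop comments (assumes no '#' inside strings)
    for match in re.finditer(r"\blayer\s*\{", text):
        depth, start = 1, match.end()
        for i in range(start, len(text)):
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
                if depth == 0:
                    yield text[start:i]
                    break


def check_conv_layers(prototxt):
    """Return (layer name, problem) pairs for Convolution layers."""
    problems = []
    for body in split_layers(prototxt):
        if not re.search(r'type:\s*"Convolution"', body):
            continue
        name = re.search(r'name:\s*"([^"]+)"', body).group(1)
        has_bias = not re.search(r"bias_term:\s*false", body)
        n_params = len(re.findall(r"\bparam\s*\{", body))
        blobs = 2 if has_bias else 1  # weights, plus bias if enabled
        if has_bias:
            problems.append((name, "bias_term: false is missing"))
        if n_params > blobs:
            problems.append((name, f"{n_params} param blocks for {blobs} blob(s)"))
        if not has_bias and "bias_filler" in body:
            problems.append((name, "bias_filler set although bias_term is false"))
    return problems


# Illustrative input only, not the real train_val.prototxt.
SAMPLE = '''
layer {
  name: "conv1_3x3_s2"
  type: "Convolution"
  param { lr_mult: 1 decay_mult: 1 }
  convolution_param { num_output: 32 kernel_size: 3 stride: 2 bias_term: false }
}
layer {
  name: "conv2_3x3_s1"
  type: "Convolution"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param { num_output: 32 kernel_size: 3 }
}
'''

for layer_name, problem in check_conv_layers(SAMPLE):
    print(layer_name, "->", problem)
```

On a real file, read the text with `open(path).read()` and pass it to `check_conv_layers`. The three checks correspond to the three issues raised in this thread: a missing bias_term: false, more param blocks than blobs, and a leftover bias_filler.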

junedgar commented 7 years ago

@shicai I see it now. Sorry, that was careless of me! Sorry for the trouble, and thanks for your help!

xizi commented 7 years ago

@junedgar Could you send me a copy of your corrected train_val.prototxt and solver.prototxt? I just ran into this problem too. Thanks~ 😁

junedgar commented 7 years ago

@xizi I'm busy during the day. I'll find some time tonight to make the changes, and if I can get it running I'll email you a copy.

xizi commented 7 years ago

@junedgar Thank you~ My e-mail: shengxili.xidian@gmail.com

xizi commented 7 years ago

@soeaver Could you send me a copy of the corrected train_val.prototxt and solver.prototxt? I just ran into this problem too. Thanks~ 😁

junedgar commented 7 years ago

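@soeaver I compared the layers in train_val and deploy and found that they seem to differ somewhat, as shown in the image. As a result, the pretrained model and the training network never match up (inceptionV4).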

[Image: comparison of the train_val and deploy network layers]

Cannot copy param 0 weights from layer 'inception_b1_1x7_2'; shape mismatch. Source param shape is 224 192 1 7 (301056); target param shape is 192 192 1 7 (258048). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.
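The two shapes in this message can be compared axis by axis to see which setting differs. The sketch below assumes Caffe's usual layout for convolution weights (num_output, input channels per group, kernel height, kernel width); `describe_mismatch` is a hypothetical helper, and the shapes are copied from the message above.

```python
from math import prod

# Assumed axis order of a Caffe convolution weight blob.
AXES = ("num_output", "input channels / group", "kernel_h", "kernel_w")


def describe_mismatch(source_shape, target_shape):
    """Summarize both shapes and name every axis on which they differ."""
    lines = [
        f"source: {source_shape} -> {prod(source_shape)} values",
        f"target: {target_shape} -> {prod(target_shape)} values",
    ]
    for axis, s, t in zip(AXES, source_shape, target_shape):
        if s != t:
            lines.append(f"{axis}: pretrained model has {s}, training prototxt has {t}")
    return lines


# Source = saved caffemodel, target = the net built from train_val.prototxt.
for line in describe_mismatch((224, 192, 1, 7), (192, 192, 1, 7)):
    print(line)
```

Under that assumption, the only differing axis is num_output (224 in the saved model, 192 in the training network), and the element counts match the 301056 and 258048 quoted in the message.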

soeaver commented 7 years ago

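@junedgar Also, the model download on Baidu Cloud comes with a deploy.prototxt. Please base your changes on that config file, not on the one in the GitHub repo; otherwise you will very likely hit errors or fail to reproduce the accuracy.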

matakk commented 7 years ago

@junedgar @xizi @shicai Hi everyone. I am also trying to modify the parameters to fine-tune inception v3, but I ran into:

F0815 13:22:46.454238 7220 net.cpp:767] Check failed: target_blobs.size() == source_layer.blobs_size() (0 vs. 2) Incompatible number of blobs for layer conv4_3x3_reduce_scale
Check failure stack trace:

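I got as far as this point with my changes and don't know which parameter to change next. I'm not yet familiar with the model. Is there a complete set of training and deployment prototxt files somewhere? Could you also share the ones you trained with? Thanks!!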

junedgar commented 7 years ago

@matakk I'm using InceptionV2-ResNet; you can refer to this code. For how to turn train_val.prototxt into deploy.prototxt, see this article. The modifications follow the same principle for inception-v3/4.