Tongcheng / DN_CaffeScript


BatchNorm with CUDNN doesn't take effect, so there are no scale factors in BatchNorm layers? #3

WenzhMicrosoft commented 6 years ago

I found that CuDNNBatchNormLayer in your Caffe branch (https://github.com/Tongcheng/caffe) is not registered with a Creator in layer_factory.cpp; instead, BatchNorm is registered with REGISTER_LAYER_CLASS(BatchNorm).

A registered CuDNNBatchNormLayer looks like this: https://github.com/BVLC/caffe/commit/c9eda39851730ebb793f1f528c160fb70dc3f6fe#diff-6fe0622356ab61c001bcac36dd571e7d
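
For reference, such a Creator would look roughly like the sketch below, following the GetConvolutionLayer pattern in Caffe's layer_factory.cpp. It assumes BatchNormParameter has the engine field that the commit above introduces:

    template <typename Dtype>
    shared_ptr<Layer<Dtype> > GetBatchNormLayer(const LayerParameter& param) {
      // Read the engine requested in the prototxt (assumed field).
      BatchNormParameter_Engine engine = param.batch_norm_param().engine();
      if (engine == BatchNormParameter_Engine_DEFAULT) {
    #ifdef USE_CUDNN
        engine = BatchNormParameter_Engine_CUDNN;
    #else
        engine = BatchNormParameter_Engine_CAFFE;
    #endif
      }
      if (engine == BatchNormParameter_Engine_CAFFE) {
        return shared_ptr<Layer<Dtype> >(new BatchNormLayer<Dtype>(param));
    #ifdef USE_CUDNN
      } else if (engine == BatchNormParameter_Engine_CUDNN) {
        return shared_ptr<Layer<Dtype> >(new CuDNNBatchNormLayer<Dtype>(param));
    #endif
      } else {
        LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
        throw;  // avoids a missing-return warning
      }
    }

    REGISTER_LAYER_CREATOR(BatchNorm, GetBatchNormLayer);

Without a Creator like this, REGISTER_LAYER_CLASS(BatchNorm) unconditionally instantiates the plain BatchNormLayer, so "engine: CUDNN" never selects the CuDNN implementation.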

So I guess the following setting won't take effect, there are no scale factors, and we need to add a Scale layer after each "BatchNorm" layer (see the sketch after the snippet below):

    scale_filler {
      type: "constant"
      value: 1
    }
    bias_filler {
      type: "constant"
      value: 0
    }
    engine: CUDNN
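
If the plain BatchNormLayer is what actually gets created, the usual stock-Caffe workaround is the BatchNorm + Scale pairing used in the public ResNet prototxts. A sketch (the layer and blob names here are illustrative):

    layer {
      name: "bn1"
      type: "BatchNorm"
      bottom: "conv1"
      top: "conv1"
    }
    layer {
      name: "scale1"
      type: "Scale"
      bottom: "conv1"
      top: "conv1"
      # bias_term adds the learnable beta alongside the gamma scale
      scale_param { bias_term: true }
    }
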
Tongcheng commented 6 years ago

Hello @WenzhMicrosoft, I am not sure what the question is, but my understanding is that during layer initialization a layer only reads its configuration from the network's prototxt, as defined by caffe.proto. Therefore I think layer_factory should not matter.

WenzhMicrosoft commented 6 years ago

Hi @Tongcheng, you specified "engine: CUDNN" in the prototxt. I think the reason you specified this is that you want to use CuDNNBatchNormLayer (do you?). It would also make more sense to use CuDNNBatchNormLayer, because it is effectively a combination of BatchNormLayer and ScaleLayer; the stock Caffe BatchNormLayer needs a separate ScaleLayer to implement the γ and β factors mentioned in the batch normalization paper.
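
For reference, the batch normalization paper defines the transform as

    \hat{x} = \frac{x - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}, \qquad y = \gamma \hat{x} + \beta

Caffe's stock BatchNormLayer computes only the normalization \hat{x}; the learnable γ (scale) and β (shift) have to come from a following ScaleLayer with bias_term: true, which is exactly what a fused CuDNN implementation would provide in a single layer.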

This is an example: https://github.com/KaimingHe/deep-residual-networks/blob/master/prototxt/ResNet-50-deploy.prototxt

As I said before, Caffe actually creates a BatchNormLayer without γ and β. Is this what you want?