BVLC / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

One question about BaseConvolutionLayer<Dtype>::backward_cpu_gemm function #5812

Open moondaiy opened 7 years ago

moondaiy commented 7 years ago

Hi everyone:

I am studying the Caffe source code for the BaseConvolutionLayer<Dtype>::backward_cpu_gemm function. According to the textbook description of backpropagation through a convolution layer, the weights need to be rotated by 180 degrees, and convolving top_diff with the rotated weights then gives bottom_diff.

But in the Caffe code, BaseConvolutionLayer<Dtype>::backward_cpu_gemm first transposes the weight matrix (CblasTrans), then calls caffe_cpu_gemm on the transposed weights and top_diff, and finally calls conv_col2im_cpu to get bottom_diff.

I do not understand why the two approaches differ. Thanks for your help.

This is the code:

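template <typename Dtype>
void BaseConvolutionLayer<Dtype>::backward_cpu_gemm(const Dtype* output,
    const Dtype* weights, Dtype* input) {
  // output holds the top diff; input receives the bottom diff.
  Dtype* col_buff = col_buffer_.mutable_cpu_data();
  if (is_1x1_) {
    // col2im is skipped below in this case, so the GEMM writes into input.
    col_buff = input;
  }
  for (int g = 0; g < group_; ++g) {
    // Per group: col_buff = weights^T * output, with shapes
    // (kernel_dim_ x K) * (K x conv_out_spatial_dim_), K = conv_out_channels_ / group_.
    caffe_cpu_gemm<Dtype>(CblasTrans, CblasNoTrans, kernel_dim_,
        conv_out_spatial_dim_, conv_out_channels_ / group_,
        (Dtype)1., weights + weight_offset_ * g, output + output_offset_ * g,
        (Dtype)0., col_buff + col_offset_ * g);
  }
  if (!is_1x1_) {
    // Convert the column buffer back to image layout.
    conv_col2im_cpu(col_buff, input);
  }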
}

giuseros commented 5 years ago

Hello @moondaiy, I have exactly the same question. Were you able to figure it out?
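
A possible way to see why the two descriptions can agree, as a minimal sketch rather than Caffe's actual code: the CblasTrans in the GEMM transposes the weight matrix of shape (output channels x kernel_dim_), it is not a spatial transpose of the kernel, and the spatial "rotation" appears implicitly when col2im adds each column entry back at its shifted position. The sketch below assumes a single input channel, a single output channel, stride 1 and no padding. The names Mat, backward_rot180, backward_gemm_col2im and rot180_matches_gemm_example are hypothetical and are not part of Caffe.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helpers for illustration only; not Caffe code.
// Assumptions: one input channel, one output channel, stride 1, no padding.
using Mat = std::vector<double>;  // row-major 2D data

// (a) Textbook form: slide the 180-degree-rotated kernel over the
// zero-padded top diff ("full" correlation).
Mat backward_rot180(const Mat& top, int oh, int ow,
                    const Mat& w, int kh, int kw) {
  assert(static_cast<int>(top.size()) == oh * ow);
  assert(static_cast<int>(w.size()) == kh * kw);
  const int ih = oh + kh - 1, iw = ow + kw - 1;  // bottom (input) size
  Mat bottom(ih * iw, 0.0);
  for (int i = 0; i < ih; ++i)
    for (int j = 0; j < iw; ++j)
      for (int a = 0; a < kh; ++a)
        for (int b = 0; b < kw; ++b) {
          // Position in top diff after padding by (kh-1, kw-1).
          const int oi = i + a - (kh - 1);
          const int oj = j + b - (kw - 1);
          if (oi < 0 || oi >= oh || oj < 0 || oj >= ow) continue;
          // Rotated kernel: rot[a][b] = w[kh-1-a][kw-1-b].
          bottom[i * iw + j] +=
              top[oi * ow + oj] * w[(kh - 1 - a) * kw + (kw - 1 - b)];
        }
  return bottom;
}

// (b) GEMM form: col = w^T * top, then col2im scatter-adds the columns.
Mat backward_gemm_col2im(const Mat& top, int oh, int ow,
                         const Mat& w, int kh, int kw) {
  assert(static_cast<int>(top.size()) == oh * ow);
  assert(static_cast<int>(w.size()) == kh * kw);
  const int ih = oh + kh - 1, iw = ow + kw - 1;
  const int kernel_dim = kh * kw, out_spatial = oh * ow;
  // w is a 1 x kernel_dim matrix, so w^T * top is kernel_dim x out_spatial.
  Mat col(kernel_dim * out_spatial);
  for (int k = 0; k < kernel_dim; ++k)
    for (int p = 0; p < out_spatial; ++p)
      col[k * out_spatial + p] = w[k] * top[p];
  // col2im: entry (k, p) belongs to input pixel (oi + ki, oj + kj).
  Mat bottom(ih * iw, 0.0);
  for (int k = 0; k < kernel_dim; ++k) {
    const int ki = k / kw, kj = k % kw;
    for (int p = 0; p < out_spatial; ++p) {
      const int oi = p / ow, oj = p % ow;
      bottom[(oi + ki) * iw + (oj + kj)] += col[k * out_spatial + p];
    }
  }
  return bottom;
}

// Compares both forms on a small integer-valued example.
bool rot180_matches_gemm_example() {
  const int oh = 2, ow = 3, kh = 2, kw = 2;
  const Mat top = {1, 2, 3, 4, 5, 6};
  const Mat w = {5, 6, 7, 8};
  return backward_rot180(top, oh, ow, w, kh, kw) ==
         backward_gemm_col2im(top, oh, ow, w, kh, kw);
}
```

Calling rot180_matches_gemm_example() from a main function should report that both forms give the same bottom diff for this small case. Under the stated assumptions, substituting ki = kh-1-a and kj = kw-1-b in form (a) gives the same sum as form (b), which is the sense in which the rotation is hidden inside the col2im index arithmetic.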