BVLC / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

A doubt about the scale layer's backward (may be a bug) #6604

Open huchhong opened 5 years ago

huchhong commented 5 years ago

Issue summary

I read the scale layer code recently and found some suspicious code. Here it is:

template <typename Dtype>
void ScaleLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {
  ...
      } else {
        const Dtype* sum_mult = sum_multiplier_.cpu_data();
        // When outer_dim_ == 1, sum_result aliases the scale blob's diff.
        sum_result = (outer_dim_ == 1) ?
            scale->mutable_cpu_diff() : sum_result_.mutable_cpu_data();
        // The gemv below writes into sum_result with beta = 0.
        caffe_cpu_gemv(CblasNoTrans, sum_result_.count(), inner_dim_,
                       Dtype(1), product, sum_mult, Dtype(0), sum_result);
      }
      if (outer_dim_ != 1) {
  ...
}

In the code above, when outer_dim_ == 1, sum_result points directly at scale->mutable_cpu_diff(), so the existing scale diff is overwritten by the result of caffe_cpu_gemv rather than accumulated into, because the beta parameter of the gemv call is zero. This seems wrong whenever the gradient already stored in the scale diff should be kept, for example when accumulating over iter_size > 1. The same pattern appears in the GPU version.
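If this reading is correct, one possible fix (a sketch only, not tested) would be to accumulate into the scale diff in the outer_dim_ == 1 case, matching how the rest of Backward_cpu accumulates into scale_diff when the scale is a learned parameter. The scale_param flag used below is assumed to be the local bool defined earlier in Backward_cpu (true when the scale is a parameter blob rather than a second bottom); it is not visible in the excerpt above.

      } else {
        const Dtype* sum_mult = sum_multiplier_.cpu_data();
        sum_result = (outer_dim_ == 1) ?
            scale->mutable_cpu_diff() : sum_result_.mutable_cpu_data();
        // beta = 1 adds to the existing diff, beta = 0 overwrites it; only the
        // case where sum_result aliases a learned parameter's diff needs to
        // accumulate (sketch, untested).
        const Dtype beta = (outer_dim_ == 1 && scale_param) ? Dtype(1) : Dtype(0);
        caffe_cpu_gemv(CblasNoTrans, sum_result_.count(), inner_dim_,
                       Dtype(1), product, sum_mult, beta, sum_result);
      }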

Steps to reproduce

Just code review.

huchhong commented 5 years ago

I have tried this comparison:

  1. set train batch to 1 and iter_size to 2
  2. set train batch to 2 and iter_size to 1

The input data is a single image, so in theory these two runs should produce the same scale diff, but they do not.
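To make the comparison concrete, here is a small standalone sketch (hypothetical code, not Caffe; the gemv helper and the numbers are made up for illustration) that mimics the y = alpha * A * x + beta * y semantics of caffe_cpu_gemv. Ignoring the solver's 1/iter_size normalization, accumulating across two passes (beta = 1) matches a single pass over a batch holding two copies of the image, while overwriting (beta = 0) keeps only the last pass, which would explain the mismatch described above.

#include <cstdio>
#include <vector>

// Mimics caffe_cpu_gemv semantics: y = alpha * A * x + beta * y,
// with A stored row-major as rows x cols.
void gemv(int rows, int cols, float alpha, const float* A,
          const float* x, float beta, float* y) {
  for (int r = 0; r < rows; ++r) {
    float acc = 0.f;
    for (int c = 0; c < cols; ++c) acc += A[r * cols + c] * x[c];
    y[r] = alpha * acc + beta * y[r];
  }
}

int main() {
  // Made-up per-image "product" row (top_diff * bottom_data for one image);
  // the gemv sums it over inner_dim_ into the scale diff.
  const std::vector<float> one_image = {1.f, 2.f, 3.f, 4.f};
  const std::vector<float> ones4(4, 1.f);

  // batch = 1, iter_size = 2: two backward passes into the same diff value.
  float diff_beta0 = 0.f;  // beta = 0 as in the current code: overwrite
  float diff_beta1 = 0.f;  // beta = 1: accumulate across passes
  for (int pass = 0; pass < 2; ++pass) {
    gemv(1, 4, 1.f, one_image.data(), ones4.data(), 0.f, &diff_beta0);
    gemv(1, 4, 1.f, one_image.data(), ones4.data(), 1.f, &diff_beta1);
  }

  // batch = 2, iter_size = 1: one pass over two copies of the same image.
  const std::vector<float> two_images = {1.f, 2.f, 3.f, 4.f, 1.f, 2.f, 3.f, 4.f};
  const std::vector<float> ones8(8, 1.f);
  float diff_batch2 = 0.f;
  gemv(1, 8, 1.f, two_images.data(), ones8.data(), 0.f, &diff_batch2);

  std::printf("batch 1, 2 passes, beta = 0: %g\n", diff_beta0);   // 10
  std::printf("batch 1, 2 passes, beta = 1: %g\n", diff_beta1);   // 20
  std::printf("batch 2, 1 pass:             %g\n", diff_batch2);  // 20
  return 0;
}

Only the beta = 1 total agrees with the single-pass batch-2 result, which is the same discrepancy as in the batch/iter_size experiment above.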