
mobilenetv1、mobilenetv2、shufflenetv1、shufflenetv2 #21

Open ysh329 opened 5 years ago

ysh329 commented 5 years ago

Related links:
is it faster and less calculation by the mobilenet on datknet ? · Issue #1 · zunzhumu/darknet-mobilenet https://github.com/zunzhumu/darknet-mobilenet/issues/1
Why can the CPU be so fast? · Issue #22 · Zehaos/MobileNet https://github.com/Zehaos/MobileNet/issues/22
add ConvolutionDepthwise layer by sp2823 · Pull Request #5665 · BVLC/caffe https://github.com/BVLC/caffe/pull/5665
yonghenglh6/DepthwiseConvolution: A personal depthwise convolution layer implementation on caffe by liuhao (GPU only) https://github.com/yonghenglh6/DepthwiseConvolution
PaulChongPeng/darknet: Convolutional Neural Networks https://github.com/PaulChongPeng/darknet


ysh329 commented 5 years ago

MobileNetV1

Depthwise (DW) convolution and pointwise (PW) convolution together form a Depthwise Separable Convolution (see Google's Xception). The structure plays the same role as a standard convolution, i.e. it extracts features, but its parameter count and computation cost are much lower, which is why it appears in lightweight networks such as MobileNet.

1. Standard convolution

Consider a 5×5-pixel, three-channel colour input image (shape 5×5×3). A convolution layer with 3×3 kernels (assuming 4 output channels, so the kernel shape is 3×3×3×4) produces 4 feature maps; with same padding their spatial size matches the input (5×5), without padding it shrinks to 3×3.

(figure: standard convolution on the 5×5×3 input producing 4 feature maps)
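To make the sizes concrete, the parameter and multiply-add counts of this standard convolution work out as follows (a quick check assuming same padding and ignoring biases):

\text{parameters} = 3 \times 3 \times 3 \times 4 = 108

\text{multiply-adds} = 3 \times 3 \times 3 \times 4 \times 5 \times 5 = 2700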

2. Depthwise Separable Convolution

A depthwise separable convolution splits a full convolution into two steps: a depthwise convolution followed by a pointwise convolution.

2.1 Depthwise Convolution

Unlike a standard convolution, in a depthwise convolution each kernel is responsible for exactly one channel, and each channel is convolved by exactly one kernel. In the standard convolution described above, every kernel operates on all input channels simultaneously.

For the same 5×5-pixel, three-channel input image (shape 5×5×3), the depthwise convolution is applied first. Unlike the standard convolution above, DW operates entirely within each 2-D plane: the number of kernels equals the number of channels of the previous layer (channels and kernels correspond one-to-one). A three-channel image therefore yields 3 feature maps after this step (5×5 each with same padding), as shown below.

(figure: depthwise convolution, one 3×3 kernel per input channel, producing 3 feature maps)

After the depthwise convolution the number of feature maps equals the number of input channels, so it cannot expand the number of feature maps. Moreover, each channel is convolved independently, so feature information from different channels at the same spatial position is not combined. A pointwise convolution is therefore needed to mix these feature maps and generate new ones.
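A minimal sketch of the depthwise step, written as plain C++ loops (not taken from any of the Caffe/darknet implementations linked above); the array names, the hard-coded 5×5×3 sizes, and the omission of padding are illustrative assumptions, so the output here is 3×3 per channel rather than 5×5:

#include <cstdio>

// Minimal depthwise 3x3 convolution sketch: one kernel per channel,
// no padding, stride 1. Sizes are hard-coded for the 5x5x3 example.
const int H = 5, W = 5, C = 3, K = 3;
const int OH = H - K + 1, OW = W - K + 1;   // 3x3 output without padding

float input[C][H][W];       // input feature map
float filter[C][K][K];      // one KxK kernel per input channel
float output[C][OH][OW];    // same number of channels as the input

void depthwise_conv() {
    for (int c = 0; c < C; ++c)                 // each channel uses its own kernel
        for (int oh = 0; oh < OH; ++oh)
            for (int ow = 0; ow < OW; ++ow) {
                float acc = 0.f;
                for (int kh = 0; kh < K; ++kh)
                    for (int kw = 0; kw < K; ++kw)
                        acc += input[c][oh + kh][ow + kw] * filter[c][kh][kw];
                output[c][oh][ow] = acc;        // channels are never mixed here
            }
}

int main() {
    depthwise_conv();
    printf("depthwise output shape: %d x %d x %d\n", C, OH, OW);
    return 0;
}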

2.2 Pointwise Convolution

A pointwise convolution works much like a standard convolution, except that its kernel size is 1×1×M, where M is the number of channels of the previous layer. The convolution therefore combines the previous step's maps along the depth dimension with learned weights and generates new feature maps; there are as many output feature maps as there are kernels, as shown below.

(figure: pointwise convolution, 1×1×M kernels combining the depthwise outputs)
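Continuing the same sketch, the pointwise step takes the M = 3 depthwise outputs and mixes them with N = 4 kernels of shape 1×1×M; again the names and fixed sizes are illustrative assumptions, not code from the linked repositories:

#include <cstdio>

// Minimal pointwise (1x1) convolution sketch: each of the N output maps is a
// weighted sum, across channels, of the M depthwise outputs at the same pixel.
const int OH = 3, OW = 3;   // spatial size coming out of the depthwise step
const int M = 3;            // input channels (= depthwise output channels)
const int N = 4;            // output channels, one per 1x1xM kernel

float dw_out[M][OH][OW];    // output of the depthwise step
float pw_filter[N][M];      // N kernels of shape 1x1xM
float pw_out[N][OH][OW];

void pointwise_conv() {
    for (int n = 0; n < N; ++n)
        for (int oh = 0; oh < OH; ++oh)
            for (int ow = 0; ow < OW; ++ow) {
                float acc = 0.f;
                for (int m = 0; m < M; ++m)     // mix information across channels
                    acc += dw_out[m][oh][ow] * pw_filter[n][m];
                pw_out[n][oh][ow] = acc;
            }
}

int main() {
    pointwise_conv();
    printf("pointwise output shape: %d x %d x %d\n", N, OH, OW);
    return 0;
}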

3. Depthwise Separable Convolution

Recall that a depthwise separable convolution replaces one full convolution with the two steps above; the next two subsections look at its computation cost and at the MobileNetV1 network built from it.

3.1 Computation cost

Loosely speaking, assume we have a W*H image with depth D at each input location. For each location we take a K*K patch, which can be viewed as a K*K*D vector, and apply M filters to it. In pseudocode (ignoring boundary conditions):

for w in 1..W
  for h in 1..H
    for x in 1..K
      for y in 1..K
        for m in 1..M
          for d in 1..D
            output(w, h, m) += input(w+x, h+y, d) * filter(m, x, y, d)
          end
        end
      end
    end
  end
end

(figure: computation cost of standard vs. depthwise separable convolution)
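The cost comparison can also be written out explicitly. Using the MobileNetV1 paper's notation (D_F the spatial size of the feature map, D_K the kernel size, M input channels, N output channels), the multiply-add counts are:

\text{standard convolution: } D_K \cdot D_K \cdot M \cdot N \cdot D_F \cdot D_F

\text{depthwise separable: } D_K \cdot D_K \cdot M \cdot D_F \cdot D_F + M \cdot N \cdot D_F \cdot D_F

\frac{D_K \cdot D_K \cdot M \cdot D_F \cdot D_F + M \cdot N \cdot D_F \cdot D_F}{D_K \cdot D_K \cdot M \cdot N \cdot D_F \cdot D_F} = \frac{1}{N} + \frac{1}{D_K^2}

For 3×3 kernels this works out to roughly an 8–9× reduction in computation.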

3.2 Network structure

(figure: MobileNetV1 network architecture)


ysh329 commented 5 years ago

MobileNetV2

ysh329 commented 5 years ago

ShuffleNetV1

(figure: ShuffleNetV1 unit with channel shuffle)

Code implementation

template <typename Dtype>
void ShuffleChannelLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
                                             const vector<Blob<Dtype>*>& top) {
    const Dtype* bottom_data = bottom[0]->cpu_data();
    Dtype* top_data = top[0]->mutable_cpu_data();

    const int num = bottom[0]->shape(0);
    const int feature_map_size = bottom[0]->count(1);
    const int sp_sz = bottom[0]->count(2);
    const int chs = bottom[0]->shape(1);

    int group_row = group_;
    int group_column = int(chs / group_row);
    CHECK_EQ(chs, (group_column * group_row)) << "Wrong group size.";

    //Dtype* temp_data = temp_blob_.mutable_cpu_data();
    for(int n = 0; n < num; ++n)
    {
        Resize_cpu(top_data + n*feature_map_size, bottom_data + n*feature_map_size, group_row, group_column, sp_sz);
    }
    //caffe_copy(bottom[0]->count(), temp_blob_.cpu_data(), top_data);
}

ShuffleNet/shuffle_channel_layer.cpp at master · farmingyard/ShuffleNet https://github.com/farmingyard/ShuffleNet/blob/master/shuffle_channel_layer.cpp#L43-L64
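The snippet above delegates the actual shuffle to Resize_cpu, which is defined elsewhere in the linked file. As a rough standalone sketch (my own reading of channel shuffle, not the repo's code), the operation can be seen as viewing the channels as a group_row × group_column matrix and transposing it:

#include <vector>
#include <cstdio>

// Standalone channel-shuffle sketch: treat the C = group_row * group_column
// channels as a matrix and transpose it, copying each channel's sp_sz spatial
// elements as one block. Names mirror the snippet above, but this is an
// illustrative re-implementation, not the repository's code.
void shuffle_channels(const float* bottom, float* top,
                      int group_row, int group_column, int sp_sz) {
    for (int i = 0; i < group_row; ++i)
        for (int j = 0; j < group_column; ++j) {
            const float* src = bottom + (i * group_column + j) * sp_sz; // channel (i, j)
            float*       dst = top    + (j * group_row + i) * sp_sz;    // moved to (j, i)
            for (int k = 0; k < sp_sz; ++k)
                dst[k] = src[k];
        }
}

int main() {
    // 6 channels of 1 element each, shuffled with 2 groups
    // (group_row = 2, group_column = 3): [0 1 2 | 3 4 5] -> [0 3 1 4 2 5]
    std::vector<float> in = {0, 1, 2, 3, 4, 5}, out(6);
    shuffle_channels(in.data(), out.data(), /*group_row=*/2, /*group_column=*/3, /*sp_sz=*/1);
    for (float v : out) printf("%g ", v);
    printf("\n");
    return 0;
}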

ysh329 commented 5 years ago

ShuffleNetV2