ysh329 / MobilenetSSDFace

Caffe implementation of Mobilenet-SSD face detector (NCS compatible)

New layers of mobilenet-ssd #1

Open ysh329 opened 5 years ago

ysh329 commented 5 years ago

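Porting MobileNet-SSD

I needed a Caffe model of MobileNet-SSD to port to our internal framework (which supports converting Caffe models into its own special format), and after some searching I found this project: BeloborodovDS/MobilenetSSDFace. This note analyzes that project: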

  1. Run the project example first
  2. New layers introduced by MobileNet-SSD
  3. Implementing the new MobileNet-SSD layers

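1. Run the project example first

I ran everything inside a caffe-ssd Docker container. After cloning the BeloborodovDS/MobilenetSSDFace repository, I found the script scripts/test_on_examples.py and modified it. Running the following commands from the repository root runs face detection on the example images, draws the results onto each image, and saves the output: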

mkdir images/output
python ./scripts/test_on_examples.py

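2. New layers introduced by MobileNet-SSD

Besides filtering the prototxt file by string, I also inspected the network structure with the netscope tool (see ssd_face_deploy_bn netscope). The network has three outputs: mbox_conf_flatten, mbox_loc, and mbox_priorbox. They correspond to the class probabilities of the object in each detection box, the box location information, and the prior (anchor) boxes, respectively.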

prototxt_path = "./ssd_face_deploy_bn.prototxt"
with open(prototxt_path) as proto_handle:
    prototxt_lines = proto_handle.readlines()
type_list = filter(lambda line: "type" in line, prototxt_lines)
type_list = set(type_list)
{'      type: "constant"\n',
 '      type: "msra"\n',
 '    code_type: CENTER_SIZE\n',
 '  type: "BatchNorm"\n',
 '  type: "Concat"\n',
 '  type: "Convolution"\n',
 '  type: "DetectionOutput"\n',
 '  type: "Flatten"\n',
 '  type: "Permute"\n',
 '  type: "PriorBox"\n',
 '  type: "ReLU"\n',
 '  type: "Reshape"\n',
 '  type: "Scale"\n',
 '  type: "Softmax"\n'}
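The substring filter above also picks up weight-filler types (`constant`, `msra`) and the `code_type` field. A minimal sketch of a stricter filter follows. It assumes that layer-level `type:` fields are indented by exactly two spaces, as they appear in the output above; `layer_types` and `sample` are hypothetical names used only for illustration.

```python
import re

# Assumption: layer-level fields are indented by exactly two spaces, as in the
# output above. Nested filler types (six spaces) and code_type are skipped.
LAYER_TYPE = re.compile(r'^  type:\s*"(\w+)"\s*$')

def layer_types(prototxt_lines):
    """Return the sorted list of distinct layer types in the given lines."""
    found = set()
    for line in prototxt_lines:
        match = LAYER_TYPE.match(line)
        if match:
            found.add(match.group(1))
    return sorted(found)

# Hypothetical sample lines in the same style as the prototxt output above.
sample = [
    '  type: "Convolution"\n',
    '      type: "msra"\n',
    '    code_type: CENTER_SIZE\n',
    '  type: "Permute"\n',
    '  type: "Convolution"\n',
]
print(layer_types(sample))
```

On the real file, the lines read from ssd_face_deploy_bn.prototxt can be passed in place of `sample`.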

New layers:

3. Implementing the new MobileNet-SSD layers

ysh329 commented 5 years ago
#include <assert.h>
#include <stdio.h>
#include <string.h> // memcpy, strncpy
#include <stdlib.h>

permute

#define INPUT_SHAPE_NUM (4) // number of input axes

// Compute the row-major strides of the input and of the permuted output.
// input_shape_swap_index[i] is the input axis that becomes output axis i.
void permute_helper(const int *input_shape, const int *input_shape_swap_index, int *input_steps, int *permuted_input_steps)
{
    assert(input_shape && input_shape_swap_index && input_steps && permuted_input_steps);

    for(int axe1_idx = 0; axe1_idx < INPUT_SHAPE_NUM; ++axe1_idx)
    {
        int input_lens = 1;
        int permuted_input_lens = 1;
        // The stride of an axis is the product of all dimensions after it.
        for(int axe2_idx = axe1_idx + 1; axe2_idx < INPUT_SHAPE_NUM; ++axe2_idx)
        {
            input_lens *= input_shape[axe2_idx];
            int swap_axe_idx = input_shape_swap_index[axe2_idx];
            permuted_input_lens *= input_shape[swap_axe_idx];
        }
        input_steps[axe1_idx] = input_lens;
        permuted_input_steps[axe1_idx] = permuted_input_lens;
    }
    return;
}

// Permute the axes of input in place: output axis i is input axis input_shape_swap_index[i].
void permute(float *input, const int *input_shape, const int *input_shape_swap_index)
{
    assert(input && input_shape && input_shape_swap_index);

    int input_num = 1;
    for(int axe_idx = 0; axe_idx < INPUT_SHAPE_NUM; ++axe_idx)
    {
        input_num *= input_shape[axe_idx];
    }
    // Despite its name, this buffer holds a copy of the original (unpermuted) data.
    float *permuted_input = (float *)calloc(input_num, sizeof(float));
    assert(permuted_input);
    memcpy(permuted_input, input, input_num * sizeof(float));

    int input_steps[INPUT_SHAPE_NUM] = {0};
    int permuted_input_steps[INPUT_SHAPE_NUM] = {0};
    permute_helper(input_shape, input_shape_swap_index, input_steps, permuted_input_steps);

    for(int pidx = 0; pidx < input_num; ++pidx)
    {
        int input_idx = 0;
        int permuted_idx = pidx;
        // Split the output index into per-axis coordinates and map each one to its source stride.
        for(int axe_idx = 0; axe_idx < INPUT_SHAPE_NUM; ++axe_idx)
        {
            int swap_axe_idx = input_shape_swap_index[axe_idx];
            input_idx += (permuted_idx / permuted_input_steps[axe_idx]) * input_steps[swap_axe_idx];
            permuted_idx %= permuted_input_steps[axe_idx];
        }
        input[pidx] = permuted_input[input_idx];
    }

    free(permuted_input);
    permuted_input = NULL;
    return;
}
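To check the C permute against an independent result, a small pure-Python reference can help. The sketch below uses the same convention (output axis i is input axis order[i]) on a flat row-major list. `permute_reference` is a hypothetical helper, and the NCHW to NHWC order (0, 2, 3, 1) is assumed here as a typical SSD Permute setting, not read from this project's prototxt.

```python
from itertools import product

def permute_reference(data, shape, order):
    """Permute a flat row-major buffer; output axis i is input axis order[i]."""
    ndim = len(shape)
    # Row-major strides of the input: product of all later dimensions.
    strides = [1] * ndim
    for axis in range(ndim - 2, -1, -1):
        strides[axis] = strides[axis + 1] * shape[axis + 1]
    out_shape = [shape[axis] for axis in order]
    out = []
    # Visit output coordinates in row-major order and look up the source element.
    for out_coord in product(*(range(dim) for dim in out_shape)):
        src = sum(out_coord[i] * strides[order[i]] for i in range(ndim))
        out.append(data[src])
    return out, out_shape

# Assumed example: a (1, 2, 2, 3) NCHW tensor permuted to NHWC.
data = list(range(1 * 2 * 2 * 3))
permuted, new_shape = permute_reference(data, [1, 2, 2, 3], [0, 2, 3, 1])
print(new_shape)
print(permuted)
```

Feeding the same data, shape, and order to the C function and comparing the buffers element by element gives a quick regression check.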

flatten

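// Flatten keeps the row-major element order, so the data is a plain copy; only the shape changes.
void flatten(const float *input, const int *input_shape, float *output)
{
    assert(input && input_shape && output);

    int input_num = 1;
    for(int axe_idx = 0; axe_idx < INPUT_SHAPE_NUM; ++axe_idx)
    {
        input_num *= input_shape[axe_idx];
    }
    // memcpy takes the destination first, then the source.
    memcpy(output, input, input_num * sizeof(float));
    return;
}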
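Since flatten does not move any data, the only thing that changes is the shape. The sketch below computes the output shape for a Caffe-style Flatten with its default `axis=1` and `end_axis=-1`. `flatten_shape` is a hypothetical helper and the example shape is made up for illustration, not taken from this network.

```python
def flatten_shape(shape, axis=1, end_axis=-1):
    """Output shape when axes axis..end_axis (inclusive) are merged into one."""
    ndim = len(shape)
    if end_axis < 0:
        end_axis += ndim
    merged = 1
    for dim in shape[axis:end_axis + 1]:
        merged *= dim
    return list(shape[:axis]) + [merged] + list(shape[end_axis + 1:])

# Hypothetical NHWC shape after a permute.
print(flatten_shape([1, 19, 19, 12]))
```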

detection_output

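#define MAX_STR_LEN (100)
// Unfinished stub: combining mbox_loc and mbox_conf with the anchors is not implemented yet.
void detection_output(float *mbox_conf, int mbox_conf_num, float *mbox_loc, int mbox_loc_num, char *anchor_file_path)
{
    assert(mbox_conf && mbox_loc && anchor_file_path);
    (void)mbox_conf_num;
    (void)mbox_loc_num;

    // Keep a bounded local copy of the anchor file path.
    char anchor_path[MAX_STR_LEN];
    strncpy(anchor_path, anchor_file_path, MAX_STR_LEN - 1);
    anchor_path[MAX_STR_LEN - 1] = '\0';
    return;
}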
ysh329 commented 5 years ago

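All three branches now produce results, and the results are correct.

  1. PriorBox branch: running Caffe yields all the anchors, which I saved to the file anchor.txt;
  2. mbox_loc and mbox_conf_flatten branches: the results are obtained through get_result.

The next steps are:

  1. Add the post-processing, i.e. the DetectionOutput layer, which combines the mbox_loc and mbox_conf_flatten results with the anchors to produce the detection boxes;
  2. Integrate with the existing project to visualize the detection results from a camera.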
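The core of the first step is decoding each mbox_loc entry against its anchor. The prototxt uses `code_type: CENTER_SIZE`, so a minimal sketch of that decoding for a single box could look as follows. It assumes corner-form anchors (xmin, ymin, xmax, ymax) and the commonly used SSD variances (0.1, 0.1, 0.2, 0.2); the real values should be taken from the PriorBox layers. `decode_center_size` is a hypothetical name.

```python
import math

def decode_center_size(prior, loc, variance=(0.1, 0.1, 0.2, 0.2)):
    """Decode one box. prior is (xmin, ymin, xmax, ymax); loc is (dx, dy, dw, dh)."""
    prior_w = prior[2] - prior[0]
    prior_h = prior[3] - prior[1]
    prior_cx = (prior[0] + prior[2]) / 2.0
    prior_cy = (prior[1] + prior[3]) / 2.0
    # Center offsets are scaled by the variances and by the prior size.
    cx = variance[0] * loc[0] * prior_w + prior_cx
    cy = variance[1] * loc[1] * prior_h + prior_cy
    # Width and height are predicted in log space.
    w = math.exp(variance[2] * loc[2]) * prior_w
    h = math.exp(variance[3] * loc[3]) * prior_h
    return (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)

# With all-zero offsets the decoded box coincides with the prior.
box = decode_center_size((0.2, 0.2, 0.6, 0.6), (0.0, 0.0, 0.0, 0.0))
print(box)
```

A full DetectionOutput step would additionally drop boxes below a confidence threshold and apply non-maximum suppression, neither of which is shown here.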