yuenshome / yuenshome.github.io

https://yuenshome.github.io
MIT License

ncnn benchmark #76

Open ysh329 opened 5 years ago

ysh329 commented 5 years ago

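Overview

The README on the ncnn project home page explains things very clearly:

ncnn is a high-performance neural network inference framework heavily optimized for mobile phones. From the start, ncnn was designed with deployment and use on mobile phones in mind. It has no third-party dependencies, is cross-platform, and runs faster on mobile CPUs than all currently known open-source frameworks. With ncnn, developers can easily port deep learning algorithms to mobile phones and run them efficiently, build AI apps, and bring AI to your fingertips. ncnn is already used in many Tencent applications, such as QQ, Qzone, WeChat, and Pitu.

The introduction is followed by the list of supported networks, so users with a concrete business need can see right away whether the model they trained is currently supported and can be deployed: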

Classical CNN: VGG AlexNet GoogleNet Inception ...
Practical CNN: ResNet DenseNet SENet FPN ...
Light-weight CNN: SqueezeNet MobileNetV1/V2/V3 ShuffleNetV1/V2 MNasNet ...
Detection: MTCNN facedetection ...
Detection: VGG-SSD MobileNet-SSD SqueezeNet-SSD MobileNetV2-SSDLite ...
Detection: Faster-RCNN R-FCN ...
Detection: YOLOV2 YOLOV3 MobileNet-YOLOV3 ...
Segmentation: FCN PSPNet UNet ...

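Next come the documentation entries, starting with how to build for each platform (NVIDIA Jetson, x86, Windows, macOS, Raspberry Pi, ARM Cortex-A series, Android, iOS, HiSilicon). Prebuilt libraries are also available for download. These are followed by a list of other entries, such as: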

how to use ncnn with alexnet with detailed steps, recommended for beginners :)
ncnn usage guide with alexnet, with detailed steps, strongly recommended for newcomers :) (Chinese version of the entry above)
use netron for ncnn model visualization
ncnn low-level operation api
ncnn param and model file spec
ncnn operation param weight table
how to implement custom layer step by step

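Finally there is an Example project section with a few Android usage examples, which look like they can be packaged into an APK in one step.

Building

I downloaded the code from the release page (version used: release-20190611). The build documentation is easy to find from the README on the ncnn home page, and the build itself went smoothly. I was too lazy to read everything in the wiki, though, and noticed build.sh and package.sh in the repository root, so I simply ran build.sh. Within ten minutes it had cross-compiled every platform listed in it, each into its own build folder. The experience was close to perfect.

Opening build.sh afterwards, I can see that it covers the following 8 variants: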

##### android armv7
##### android aarch64
##### android armv7 without neon
##### android armv7 vulkan
##### android aarch64 vulkan
##### linux of hisiv300 (forgot the chip name) toolchain with neon and openmp
##### linux of hisiv500 (Hi3516CV200 and Hi3519V101) toolchain with neon and openmp
##### linux of himix100 (Hi3559a) toolchain with neon and openmp
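
After running build.sh, a quick check like the one below can confirm which Android builds actually produced a benchmark binary. This is only a sketch: the four directory names and the benchmark/benchncnn location are the ones used by the benchmark script later in this thread, and the `check_builds` function and its `root` argument are hypothetical names introduced here for illustration.

```bash
#!/usr/bin/env bash
# Report which Android build directories contain a benchncnn binary.
# Directory names and the benchmark/benchncnn path follow the benchmark
# script in this thread; check_builds itself is a hypothetical helper.
check_builds() {
    local root="${1:-.}"   # ncnn source root, defaults to the current directory
    local dir
    for dir in build-android-armv7 build-android-aarch64 \
               build-android-armv7-vulkan build-android-aarch64-vulkan; do
        if [ -f "${root}/${dir}/benchmark/benchncnn" ]; then
            echo "found   ${dir}"
        else
            echo "missing ${dir}"
        fi
    done
}

check_builds .
```

Running it from the ncnn source root prints one line per build directory, which makes a partially failed cross-compile easy to spot before pushing anything to a device.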

package.sh, in turn, contains 4 variants and appears to be the script used to package each release:

##### package android lib
##### package ios framework
##### package android lib vulkan
##### package ios framework vulkan

These basics feel very well done.

ysh329 commented 5 years ago

Benchmark

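Performance still has to be measured. Since I did not feel like digging through the docs, I went straight to the benchmark folder, whose README shows the basic usage of the benchmark tool. In short: ./benchncnn [loop count] [num threads] [powersave] [gpu device]. The parameters are described in the following table: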

| param | options | default |
| --- | --- | --- |
| loop count | 1~N | 4 |
| num threads | 1~N | max_cpu_count |
| powersave | 0=all cores, 1=little cores only, 2=big cores only | 0 |
| gpu device | -1=cpu-only, 0=gpu0, 1=gpu1 ... | -1 |
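
To make the positional arguments concrete, here is a small sketch that composes a benchncnn command line from the four parameters in the table. The `benchncnn_cmd` helper is hypothetical. Its defaults mirror the table except for num threads: the documented default, max_cpu_count, depends on the device, so the value 1 used here is only a placeholder assumption.

```bash
#!/usr/bin/env bash
# Compose a benchncnn command line from the four documented positional
# arguments. benchncnn_cmd is a hypothetical helper, not part of ncnn.
benchncnn_cmd() {
    local loop_count="${1:-4}"    # documented default: 4
    local num_threads="${2:-1}"   # placeholder; the real default is max_cpu_count
    local powersave="${3:-0}"     # 0=all cores, 1=little cores only, 2=big cores only
    local gpu_device="${4:--1}"   # -1=cpu-only, 0=gpu0, 1=gpu1 ...
    echo "./benchncnn ${loop_count} ${num_threads} ${powersave} ${gpu_device}"
}

benchncnn_cmd 8 4 2 -1   # prints: ./benchncnn 8 4 2 -1  (8 loops, 4 threads, big cores, CPU only)
benchncnn_cmd 4 1 2 0    # prints: ./benchncnn 4 1 2 0   (4 loops, 1 thread, big cores, gpu0)
```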

It looks simple enough to use. To test armv7, armv8, CPU and GPU in a single run, I wrote the following script:

#!/usr/bin/env bash

# init params
HOST_BUILD_PATH_LIST=("build-android-armv7" \
                      "build-android-aarch64" \
                      "build-android-armv7-vulkan" \
                      "build-android-aarch64-vulkan")
#HOST_BUILD_PATH_LIST=("build-android-aarch64-vulkan")

NCNN_DEVICE_PATH="/data/local/tmp/ncnn"
NCNN_LOOP_COUNT="4" #"10"

# cpu setting
NCNN_NUM_THREADS_LIST=("1" "2" "4")
NCNN_POWERSAVE="2"
NCNN_CPU_DEVICE="-1"

# gpu setting
NCNN_CPU_NUM_THREADS_ON_GPU_DEVICE="1"
NCNN_GPU_DEVICE="0"
NCNN_GPU_PATTERN="vulkan"

# push benchmark: create the device directory (no error if it already exists)
# and upload the model .param files
adb shell "mkdir -p ${NCNN_DEVICE_PATH}"
adb push --sync ./benchmark/*.param "${NCNN_DEVICE_PATH}"

# run benchmark
for HOST_BUILD_PATH in "${HOST_BUILD_PATH_LIST[@]}"; do
    echo "---- HOST_BUILD_PATH:${HOST_BUILD_PATH} ----"
    adb push --sync "./${HOST_BUILD_PATH}/benchmark/benchncnn" "${NCNN_DEVICE_PATH}"

    if [[ ${HOST_BUILD_PATH} =~ ${NCNN_GPU_PATTERN} ]]
    then # vulkan build: run once on the GPU with a single CPU thread
        adb shell "cd ${NCNN_DEVICE_PATH}; ${NCNN_DEVICE_PATH}/benchncnn ${NCNN_LOOP_COUNT} ${NCNN_CPU_NUM_THREADS_ON_GPU_DEVICE} ${NCNN_POWERSAVE} ${NCNN_GPU_DEVICE}"
    else # cpu build: run once per thread count
        for NCNN_NUM_THREADS in "${NCNN_NUM_THREADS_LIST[@]}"; do
            adb shell "cd ${NCNN_DEVICE_PATH}; ${NCNN_DEVICE_PATH}/benchncnn ${NCNN_LOOP_COUNT} ${NCNN_NUM_THREADS} ${NCNN_POWERSAVE} ${NCNN_CPU_DEVICE}"
        done
    fi
    echo
done

ysh329 commented 5 years ago

Benchmark results

Below is the output of the script:

Xiaomi MIX 2 (Snapdragon 835):

./run_benchmark.sh
---- HOST_BUILD_PATH:build-android-aarch64 ----
./build-android-aarch64/benchmark/benchncnn: 1 file pushed. 11.4 MB/s (3861624 bytes in 0.324s)
loop_count = 30
num_threads = 1
powersave = 2
gpu_device = -1
          squeezenet  min =   65.76  max =   67.04  avg =   66.34
     squeezenet_int8  min =   52.01  max =   52.99  avg =   52.35
           mobilenet  min =  111.35  max =  115.23  avg =  112.65
      mobilenet_int8  min =   95.67  max =   97.91  avg =   96.49
        mobilenet_v2  min =   73.64  max =   75.48  avg =   74.20
   mobilenet_v2_int8  min =   97.36  max =   98.75  avg =   97.83
          shufflenet  min =   32.88  max =   34.17  avg =   33.37
             mnasnet  min =   70.58  max =   71.68  avg =   71.20
     proxylessnasnet  min =   84.92  max =   86.23  avg =   85.59
           googlenet  min =  270.26  max =  273.59  avg =  272.03
      googlenet_int8  min =  200.24  max =  202.02  avg =  201.13
            resnet18  min =  220.86  max =  224.39  avg =  222.21
       resnet18_int8  min =  159.41  max =  163.78  avg =  160.78
             alexnet  min =  355.69  max =  358.26  avg =  356.64
               vgg16  min = 1320.49  max = 1328.02  avg = 1324.13
          vgg16_int8  min =  930.66  max =  935.90  avg =  933.28
            resnet50  min =  545.18  max =  557.93  avg =  551.81
       resnet50_int8  min =  408.54  max =  412.91  avg =  410.69
      squeezenet_ssd  min =  149.36  max =  152.09  avg =  150.82
 squeezenet_ssd_int8  min =  124.49  max =  126.80  avg =  125.79
       mobilenet_ssd  min =  223.43  max =  227.36  avg =  225.34
  mobilenet_ssd_int8  min =  186.45  max =  190.67  avg =  188.12
      mobilenet_yolo  min =  490.65  max =  509.05  avg =  498.56
    mobilenet_yolov3  min =  517.61  max =  532.40  avg =  525.21
loop_count = 30
num_threads = 2
powersave = 2
gpu_device = -1
          squeezenet  min =   37.20  max =   38.24  avg =   37.54
     squeezenet_int8  min =   30.35  max =   30.96  avg =   30.62
           mobilenet  min =   60.72  max =   62.21  avg =   61.41
      mobilenet_int8  min =   52.85  max =   53.44  avg =   53.15
        mobilenet_v2  min =   42.89  max =   43.54  avg =   43.20
   mobilenet_v2_int8  min =   54.77  max =   55.73  avg =   55.19
          shufflenet  min =   19.61  max =   19.97  avg =   19.84
             mnasnet  min =   40.29  max =   40.80  avg =   40.49
     proxylessnasnet  min =   46.80  max =   47.47  avg =   47.16
           googlenet  min =  147.91  max =  151.11  avg =  149.03
      googlenet_int8  min =  116.00  max =  119.52  avg =  117.68
            resnet18  min =  131.23  max =  137.46  avg =  134.50
       resnet18_int8  min =  102.33  max =  104.83  avg =  103.22
             alexnet  min =  227.82  max =  245.82  avg =  237.15
               vgg16  min =  705.58  max =  773.44  avg =  733.53
          vgg16_int8  min =  626.58  max =  629.30  avg =  628.02
            resnet50  min =  334.19  max =  342.53  avg =  337.66
       resnet50_int8  min =  263.17  max =  266.72  avg =  264.61
      squeezenet_ssd  min =   95.79  max =   99.03  avg =   96.92
 squeezenet_ssd_int8  min =   85.15  max =   86.29  avg =   85.58
       mobilenet_ssd  min =  135.69  max =  138.18  avg =  136.60
  mobilenet_ssd_int8  min =  115.89  max =  118.18  avg =  116.88
      mobilenet_yolo  min =  297.74  max =  306.35  avg =  302.59
    mobilenet_yolov3  min =  310.97  max =  317.83  avg =  314.48
loop_count = 30
num_threads = 4
powersave = 2
gpu_device = -1
          squeezenet  min =   25.22  max =   26.32  avg =   25.47
     squeezenet_int8  min =   20.33  max =   20.85  avg =   20.57
           mobilenet  min =   38.05  max =   38.96  avg =   38.43
      mobilenet_int8  min =   31.63  max =   32.24  avg =   31.96
        mobilenet_v2  min =   28.87  max =   29.65  avg =   29.24
   mobilenet_v2_int8  min =   32.88  max =   33.68  avg =   33.27
          shufflenet  min =   14.85  max =   15.23  avg =   14.98
             mnasnet  min =   26.34  max =   26.92  avg =   26.61
     proxylessnasnet  min =   29.90  max =   62.07  avg =   33.35
           googlenet  min =  166.45  max =  381.83  avg =  194.02
      googlenet_int8  min =  126.46  max =  326.80  avg =  144.00
            resnet18  min =  139.01  max =  360.16  avg =  162.16
       resnet18_int8  min =   96.23  max =  324.38  avg =  121.57
             alexnet  min =  226.31  max =  399.99  avg =  249.32
               vgg16  min =  779.21  max = 1008.77  avg =  861.87
          vgg16_int8  min =  539.21  max =  908.38  avg =  692.29
            resnet50  min =  341.18  max =  573.33  avg =  383.52
       resnet50_int8  min =  251.41  max =  458.00  avg =  293.22
      squeezenet_ssd  min =   97.21  max =  328.35  avg =  127.08
 squeezenet_ssd_int8  min =   84.17  max =  298.06  avg =  114.51
       mobilenet_ssd  min =  139.15  max =  342.19  avg =  161.25
  mobilenet_ssd_int8  min =  119.79  max =  323.15  avg =  129.42
      mobilenet_yolo  min =  304.88  max =  517.90  avg =  345.86
    mobilenet_yolov3  min =  317.80  max =  535.22  avg =  363.07

---- HOST_BUILD_PATH:build-android-armv7-vulkan ----
./build-android-armv7-vulkan/benchmark/benchncnn: 1 file pushed. 13.1 MB/s (5634628 bytes in 0.411s)
[0 Adreno (TM) 540]  queueC=0[3]  queueT=0[3]  memU=2  memDL=2  memHV=2
[0 Adreno (TM) 540]  fp16p=1  fp16s=0  fp16a=0  int8s=0  int8a=0
loop_count = 4
num_threads = 1
powersave = 2
gpu_device = 0
          squeezenet  min =   33.98  max =   34.27  avg =   34.09
           mobilenet  min =   51.07  max =   51.43  avg =   51.27
        mobilenet_v2  min =   37.17  max =   37.71  avg =   37.38
          shufflenet  min =   33.12  max =   33.65  avg =   33.41
             mnasnet  min =   37.71  max =   38.29  avg =   37.99
     proxylessnasnet  min =   41.29  max =   41.55  avg =   41.42
           googlenet  min =  135.20  max =  136.22  avg =  135.67
            resnet18  min =  141.61  max =  142.30  avg =  141.92
             alexnet  min =  262.48  max =  266.26  avg =  264.26
               vgg16  min =  905.66  max =  909.30  avg =  907.33
            resnet50  min =  313.37  max =  314.29  avg =  313.85
      squeezenet_ssd  min =  178.45  max =  181.51  avg =  179.99
       mobilenet_ssd  min =  122.03  max =  122.81  avg =  122.32
      mobilenet_yolo  min =  253.64  max =  254.25  avg =  253.99
    mobilenet_yolov3  min =  244.29  max =  245.39  avg =  244.60

---- HOST_BUILD_PATH:build-android-aarch64-vulkan ----
./build-android-aarch64-vulkan/benchmark/benchncnn: 1 file pushed. 12.8 MB/s (6982616 bytes in 0.520s)
[0 Adreno (TM) 540]  queueC=0[3]  queueT=0[3]  memU=2  memDL=2  memHV=2
[0 Adreno (TM) 540]  fp16p=1  fp16s=0  fp16a=0  int8s=0  int8a=0
loop_count = 4
num_threads = 1
powersave = 2
gpu_device = 0
          squeezenet  min =   34.16  max =   34.28  avg =   34.22
           mobilenet  min =   51.47  max =   51.84  avg =   51.60
        mobilenet_v2  min =   37.26  max =   37.46  avg =   37.38
          shufflenet  min =   32.85  max =   33.32  avg =   33.04
             mnasnet  min =   37.54  max =   37.76  avg =   37.64
     proxylessnasnet  min =   41.06  max =   41.32  avg =   41.25
           googlenet  min =  134.62  max =  135.20  avg =  134.93
            resnet18  min =  140.67  max =  141.28  avg =  140.93
             alexnet  min =  262.57  max =  266.50  avg =  264.16
               vgg16  min =  918.40  max =  921.52  avg =  919.58
            resnet50  min =  314.76  max =  315.97  avg =  315.33
      squeezenet_ssd  min =  177.70  max =  182.15  avg =  179.37
       mobilenet_ssd  min =  120.30  max =  121.66  avg =  121.24
      mobilenet_yolo  min =  256.96  max =  258.84  avg =  257.91
    mobilenet_yolov3  min =  282.06  max =  292.41  avg =  286.07
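
Logs in this shape are easier to compare once flattened into one row per measurement. The sketch below is one possible way to do that with awk; the `parse_benchncnn` function and the CSV column order (build, num_threads, gpu_device, model, avg) are choices made here for illustration, not something ncnn provides.

```bash
#!/usr/bin/env bash
# Flatten benchncnn output into CSV rows: build,num_threads,gpu_device,model,avg
# parse_benchncnn is a hypothetical helper; it relies only on the log layout
# shown in this thread.
parse_benchncnn() {
    awk '
        /^---- HOST_BUILD_PATH:/ { split($2, part, ":"); build = part[2] }
        /^num_threads = /        { threads = $3 }
        /^gpu_device = /         { gpu = $3 }
        $2 == "min" && $8 == "avg" {
            # $1 is the model name, the last field is the average time
            print build "," threads "," gpu "," $1 "," $NF
        }
    '
}

# Demo on a few lines copied from the log above.
parse_benchncnn <<'EOF'
---- HOST_BUILD_PATH:build-android-aarch64 ----
loop_count = 30
num_threads = 1
powersave = 2
gpu_device = -1
          squeezenet  min =   65.76  max =   67.04  avg =   66.34
EOF
# prints: build-android-aarch64,1,-1,squeezenet,66.34
```

With a device attached, the same function could be fed from the script directly, for example `./run_benchmark.sh | parse_benchncnn > results.csv`.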
ysh329 commented 5 years ago

HUAWEI Mate10 Kirin970

ARMv7

---- HOST_BUILD_PATH:build-android-armv7 ----
./build-android-armv7/benchmark/benchncnn: 1 file pushed. 24.5 MB/s (2685992 bytes in 0.104s)
loop_count = 4
num_threads = 1
powersave = 2
gpu_device = -1
          squeezenet  min =   85.00  max =   89.16  avg =   86.83
     squeezenet_int8  min =   65.08  max =   65.82  avg =   65.50
           mobilenet  min =  134.76  max =  140.12  avg =  138.34
      mobilenet_int8  min =  121.17  max =  121.76  avg =  121.58
        mobilenet_v2  min =   90.89  max =   95.57  avg =   93.02
   mobilenet_v2_int8  min =  116.05  max =  116.62  avg =  116.40
          shufflenet  min =   37.11  max =   38.38  avg =   37.67
             mnasnet  min =   83.61  max =   88.78  avg =   86.95
     proxylessnasnet  min =   97.71  max =  101.46  avg =   98.84
           googlenet  min =  336.23  max =  347.73  avg =  341.21
      googlenet_int8  min =  246.59  max =  250.74  avg =  248.05
            resnet18  min =  319.57  max =  349.31  avg =  331.44
       resnet18_int8  min =  190.73  max =  196.32  avg =  192.71
             alexnet  min =  403.17  max =  405.28  avg =  404.37
               vgg16  min = 2230.21  max = 2287.31  avg = 2261.56
          vgg16_int8  min = 1329.28  max = 1350.18  avg = 1341.80
            resnet50  min =  727.94  max =  740.91  avg =  735.40
       resnet50_int8  min =  522.10  max =  534.12  avg =  527.09
      squeezenet_ssd  min =  240.87  max =  270.81  avg =  254.03
 squeezenet_ssd_int8  min =  161.62  max =  174.92  avg =  168.06
       mobilenet_ssd  min =  283.23  max =  287.21  avg =  286.02
  mobilenet_ssd_int8  min =  229.31  max =  241.69  avg =  234.10
      mobilenet_yolo  min =  638.52  max =  677.05  avg =  664.11
    mobilenet_yolov3  min =  679.72  max =  695.60  avg =  687.08
loop_count = 4
num_threads = 2
powersave = 2
gpu_device = -1
          squeezenet  min =   45.89  max =   48.57  avg =   47.14
     squeezenet_int8  min =   43.66  max =  133.35  avg =   72.47
           mobilenet  min =   75.25  max =   76.92  avg =   76.45
      mobilenet_int8  min =   63.57  max =   64.06  avg =   63.81
        mobilenet_v2  min =   49.08  max =   56.66  avg =   51.16
   mobilenet_v2_int8  min =   62.37  max =   62.86  avg =   62.62
          shufflenet  min =   22.23  max =   25.20  avg =   23.91
             mnasnet  min =   46.00  max =   53.25  avg =   48.71
     proxylessnasnet  min =   58.46  max =   63.96  avg =   61.66
           googlenet  min =  186.85  max =  258.86  avg =  209.43
      googlenet_int8  min =  133.67  max =  141.28  avg =  136.48
            resnet18  min =  164.81  max =  176.10  avg =  172.95
       resnet18_int8  min =  104.26  max =  122.90  avg =  111.89
             alexnet  min =  225.34  max =  225.56  avg =  225.46
               vgg16  min =  963.31  max =  985.44  avg =  976.28
          vgg16_int8  min =  737.81  max =  756.68  avg =  743.46
            resnet50  min =  383.86  max =  402.49  avg =  395.89
       resnet50_int8  min =  279.28  max =  369.81  avg =  302.96
      squeezenet_ssd  min =  130.08  max =  136.38  avg =  132.78
 squeezenet_ssd_int8  min =   97.66  max =  102.17  avg =   99.94
       mobilenet_ssd  min =  142.50  max =  147.67  avg =  145.35
  mobilenet_ssd_int8  min =  124.06  max =  125.40  avg =  124.56
      mobilenet_yolo  min =  335.56  max =  341.81  avg =  337.67
    mobilenet_yolov3  min =  342.10  max =  354.65  avg =  348.84
loop_count = 4
num_threads = 4
powersave = 2
gpu_device = -1
          squeezenet  min =   29.68  max =   37.17  avg =   32.55
     squeezenet_int8  min =   22.44  max =   26.28  avg =   23.43
           mobilenet  min =   46.23  max =   46.75  avg =   46.52
      mobilenet_int8  min =   34.25  max =   34.37  avg =   34.30
        mobilenet_v2  min =   29.91  max =   36.68  avg =   33.78
   mobilenet_v2_int8  min =   31.93  max =   36.64  avg =   34.73
          shufflenet  min =   16.09  max =   17.71  avg =   16.53
             mnasnet  min =   34.28  max =   34.56  avg =   34.42
     proxylessnasnet  min =   35.31  max =  105.03  avg =   67.16
           googlenet  min =   98.83  max =  103.55  avg =  101.04
      googlenet_int8  min =   77.47  max =   80.06  avg =   78.96
            resnet18  min =   94.75  max =  108.81  avg =  104.37
       resnet18_int8  min =   68.67  max =   69.41  avg =   69.01
             alexnet  min =  333.09  max =  334.06  avg =  333.53
               vgg16  min =  979.84  max = 1031.63  avg = 1015.87
          vgg16_int8  min =  426.72  max =  460.57  avg =  440.25
            resnet50  min =  462.01  max =  535.41  avg =  482.11
       resnet50_int8  min =  327.75  max =  333.08  avg =  331.69
      squeezenet_ssd  min =  136.53  max =  139.89  avg =  137.77
 squeezenet_ssd_int8  min =  112.72  max =  117.89  avg =  114.80
       mobilenet_ssd  min =  175.75  max =  177.17  avg =  176.53
  mobilenet_ssd_int8  min =   71.63  max =  131.70  avg =   95.69
      mobilenet_yolo  min =  189.68  max =  195.27  avg =  192.12
    mobilenet_yolov3  min =  199.68  max =  202.75  avg =  200.74

---- HOST_BUILD_PATH:build-android-armv7-vulkan ----
./build-android-armv7-vulkan/benchmark/benchncnn: 1 file pushed. 15.2 MB/s (5634628 bytes in 0.353s)
arm mali driver is too old
no vulkan device
Segmentation fault

ARMv8

---- HOST_BUILD_PATH:build-android-aarch64 ----
./build-android-aarch64/benchmark/benchncnn: 1 file pushed. 14.9 MB/s (3861624 bytes in 0.248s)
loop_count = 30
num_threads = 1
powersave = 2
gpu_device = -1
          squeezenet  min =   74.55  max =   90.57  avg =   79.59
     squeezenet_int8  min =   55.58  max =   68.79  avg =   62.37
           mobilenet  min =  121.62  max =  135.79  avg =  126.89
      mobilenet_int8  min =  100.52  max =  103.81  avg =  101.92
        mobilenet_v2  min =   83.93  max =   98.59  avg =   86.97
   mobilenet_v2_int8  min =  107.26  max =  115.27  avg =  111.07
          shufflenet  min =   34.72  max =   39.71  avg =   35.71
             mnasnet  min =   78.93  max =   85.89  avg =   81.85
     proxylessnasnet  min =   95.08  max =  100.35  avg =   97.16
           googlenet  min =  303.11  max =  329.98  avg =  314.69
      googlenet_int8  min =  216.67  max =  227.69  avg =  221.26
            resnet18  min =  274.83  max =  329.87  avg =  295.62
       resnet18_int8  min =  173.21  max =  183.78  avg =  179.98
             alexnet  min =  370.25  max =  373.95  avg =  371.30
               vgg16  min = 1827.30  max = 1923.14  avg = 1895.71
          vgg16_int8  min = 1112.88  max = 1165.14  avg = 1141.72
            resnet50  min =  625.61  max =  669.01  avg =  645.58
       resnet50_int8  min =  436.76  max =  463.67  avg =  446.46
      squeezenet_ssd  min =  224.27  max =  263.29  avg =  250.53
 squeezenet_ssd_int8  min =  147.33  max =  176.24  avg =  161.91
       mobilenet_ssd  min =  256.78  max =  277.79  avg =  267.79
  mobilenet_ssd_int8  min =  197.98  max =  207.53  avg =  201.26
      mobilenet_yolo  min =  575.37  max =  622.58  avg =  594.77
    mobilenet_yolov3  min =  593.14  max =  646.65  avg =  616.91
loop_count = 30
num_threads = 2
powersave = 2
gpu_device = -1
          squeezenet  min =   40.14  max =   50.61  avg =   42.96
     squeezenet_int8  min =   32.31  max =   40.19  avg =   36.14
           mobilenet  min =   67.06  max =   80.45  avg =   72.44
      mobilenet_int8  min =   55.08  max =   59.76  avg =   55.95
        mobilenet_v2  min =   51.76  max =  101.06  avg =   96.94
   mobilenet_v2_int8  min =  121.57  max =  124.04  avg =  123.43
          shufflenet  min =   44.60  max =   45.95  avg =   45.31
             mnasnet  min =   44.14  max =   93.43  avg =   67.05
     proxylessnasnet  min =   51.41  max =   65.30  avg =   59.82
           googlenet  min =  162.63  max =  192.40  avg =  175.51
      googlenet_int8  min =  121.24  max =  134.51  avg =  126.40
            resnet18  min =  158.97  max =  280.00  avg =  263.59
       resnet18_int8  min =   98.36  max =  215.57  avg =  124.50
             alexnet  min =  218.35  max =  219.02  avg =  218.67
               vgg16  min =  837.97  max = 1612.68  avg = 1067.26
          vgg16_int8  min =  632.97  max = 1294.97  avg =  958.40
            resnet50  min =  341.85  max =  693.17  avg =  439.81
       resnet50_int8  min =  259.45  max =  541.71  avg =  518.42
      squeezenet_ssd  min =  107.39  max =  131.17  avg =  124.62
 squeezenet_ssd_int8  min =   91.61  max =  107.65  avg =  100.15
       mobilenet_ssd  min =  134.31  max =  279.88  avg =  230.91
  mobilenet_ssd_int8  min =  230.49  max =  239.03  avg =  234.40
      mobilenet_yolo  min =  291.96  max =  621.08  avg =  452.18
    mobilenet_yolov3  min =  306.75  max =  648.73  avg =  473.47
loop_count = 30
num_threads = 4
powersave = 2
gpu_device = -1
          squeezenet  min =   46.78  max =   51.50  avg =   49.63
     squeezenet_int8  min =   37.11  max =   38.97  avg =   37.87
           mobilenet  min =   75.26  max =   76.59  avg =   76.08
      mobilenet_int8  min =   64.76  max =   67.07  avg =   65.90
        mobilenet_v2  min =   51.47  max =   59.79  avg =   55.19
   mobilenet_v2_int8  min =   64.40  max =   68.32  avg =   66.55
          shufflenet  min =   29.86  max =   31.13  avg =   30.59
             mnasnet  min =   50.98  max =   55.68  avg =   52.79
     proxylessnasnet  min =   60.25  max =   64.13  avg =   60.67
           googlenet  min =  180.96  max =  185.45  avg =  184.15
      googlenet_int8  min =  150.76  max =  154.75  avg =  152.60
            resnet18  min =  161.39  max =  166.26  avg =  163.97
       resnet18_int8  min =  127.88  max =  130.17  avg =  129.49
             alexnet  min =  321.55  max =  322.60  avg =  322.15
               vgg16  min =  886.23  max =  900.60  avg =  892.33
          vgg16_int8  min =  822.98  max =  835.05  avg =  833.14
            resnet50  min =  375.48  max =  382.05  avg =  377.87
       resnet50_int8  min =  307.11  max =  311.59  avg =  309.16
      squeezenet_ssd  min =  123.04  max =  125.96  avg =  124.93
 squeezenet_ssd_int8  min =  102.48  max =  112.02  avg =  107.77
       mobilenet_ssd  min =  150.26  max =  153.14  avg =  151.52
  mobilenet_ssd_int8  min =  125.05  max =  128.00  avg =  126.56
      mobilenet_yolo  min =  329.87  max =  339.76  avg =  336.38
    mobilenet_yolov3  min =  342.86  max =  354.08  avg =  350.19
---- HOST_BUILD_PATH:build-android-aarch64-vulkan ----
./build-android-aarch64-vulkan/benchmark/benchncnn: 1 file pushed. 23.6 MB/s (6982616 bytes in 0.282s)
arm mali driver is too old
no vulkan device
Segmentation fault
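
The Vulkan runs on this device end in a segmentation fault, which is easy to miss in a long log. A small wrapper like the following sketch can flag such failures explicitly while letting the remaining builds run. `run_or_report` is a hypothetical helper, and using it with adb assumes that `adb shell` forwards the exit status of the remote command, which may not hold for older adb versions.

```bash
#!/usr/bin/env bash
# Run a command and print a clearly marked line if it exits with a non-zero
# status, without aborting the caller. run_or_report is a hypothetical helper.
run_or_report() {
    local label="$1"
    shift
    local status=0
    "$@" || status=$?
    if [ "${status}" -ne 0 ]; then
        echo "FAILED ${label} (exit status ${status})"
    fi
    return 0
}

run_or_report "demo-ok" true      # prints nothing
run_or_report "demo-fail" false   # prints: FAILED demo-fail (exit status 1)

# Possible use inside the benchmark loop (assumes adb forwards exit codes):
# run_or_report "${HOST_BUILD_PATH}" adb shell "cd ${NCNN_DEVICE_PATH}; ./benchncnn 4 1 2 0"
```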
ysh329 commented 5 years ago

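Impressions of using ncnn

The build and benchmark workflows are concise, clean, and get the job done in one step. Most of the documentation lives in the wiki. It does not look as polished as TensorFlow Lite's, but it is well organized.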

ysh329 commented 5 years ago

ncnn/run_benchmark.sh · NotBad/dl-inference-benchmark - Gitee - OSChina https://gitee.com/yuens/dl-inference-benchmark/blob/master/ncnn/run_benchmark.sh

ysh329 commented 4 years ago

build_vulkan_armv7.sh

# quote the URL so the shell does not treat "?" as a glob character
wget "https://sdk.lunarg.com/sdk/download/1.1.114.0/linux/vulkansdk-linux-x86_64-1.1.114.0.tar.gz?Human=true" -O vulkansdk-linux-x86_64-1.1.114.0.tar.gz
tar -xf vulkansdk-linux-x86_64-1.1.114.0.tar.gz

# setup env
export VULKAN_SDK="$(pwd)/1.1.114.0/x86_64"

##### android armv7 vulkan
mkdir -p build-android-armv7-vulkan
pushd build-android-armv7-vulkan
cmake -DCMAKE_TOOLCHAIN_FILE="$ANDROID_NDK/build/cmake/android.toolchain.cmake" -DANDROID_ABI="armeabi-v7a" -DANDROID_ARM_NEON=ON -DANDROID_PLATFORM=android-24 -DNCNN_VULKAN=ON ..
make -j8
make install
popd
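
The script above depends on ANDROID_NDK already being set and on VULKAN_SDK being exported. If either is empty, cmake may fail with a less obvious error. A guard along these lines can catch that early; `require_env` is a hypothetical helper name, and the sketch relies on bash indirect expansion.

```bash
#!/usr/bin/env bash
# Return non-zero if any of the named environment variables is unset or empty.
# require_env is a hypothetical helper; ${!name} is bash indirect expansion.
require_env() {
    local name
    local missing=0
    for name in "$@"; do
        if [ -z "${!name:-}" ]; then
            echo "error: ${name} is not set" >&2
            missing=1
        fi
    done
    return "${missing}"
}

# Demo with a variable defined just for this example.
DEMO_VAR="some-value"
require_env DEMO_VAR && echo "DEMO_VAR is set"

# In the build script this could precede the cmake call:
# require_env ANDROID_NDK VULKAN_SDK || exit 1
```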
ysh329 commented 4 years ago

Model conversion: caffe2ncnn

##### linux host system with gcc/g++
mkdir -p build-host-gcc-linux
pushd build-host-gcc-linux
cmake -DCMAKE_TOOLCHAIN_FILE=../toolchains/host.gcc.toolchain.cmake -DNCNN_BUILD_TOOLS=ON ..
make -j30
make install
popd
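
Once the host tools are built, converting a Caffe model might look like the sketch below. The argument order (prototxt, caffemodel, output .param, output .bin) follows the caffe2ncnn usage described in the ncnn documentation. The `convert_caffe_model` wrapper, the file names, and the tool path under build-host-gcc-linux/tools/caffe/ are assumptions made here and should be checked against your own build tree.

```bash
#!/usr/bin/env bash
# Convert a Caffe model to ncnn .param/.bin files after checking the inputs.
# convert_caffe_model is a hypothetical wrapper around the caffe2ncnn tool.
convert_caffe_model() {
    local tool="$1"         # path to the caffe2ncnn binary
    local prototxt="$2"     # Caffe network definition
    local caffemodel="$3"   # Caffe weights
    local out_prefix="$4"   # output prefix; .param and .bin are appended
    local f
    for f in "${tool}" "${prototxt}" "${caffemodel}"; do
        if [ ! -e "${f}" ]; then
            echo "error: missing ${f}" >&2
            return 1
        fi
    done
    "${tool}" "${prototxt}" "${caffemodel}" "${out_prefix}.param" "${out_prefix}.bin"
}

# Example call (tool path and file names are assumptions):
# convert_caffe_model ./build-host-gcc-linux/tools/caffe/caffe2ncnn \
#     deploy.prototxt bvlc_alexnet.caffemodel alexnet
```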