yuenshome / yuenshome.github.io

https://yuenshome.github.io
MIT License

RapidNet->TNN #116

Open ysh329 opened 4 years ago

ysh329 commented 4 years ago

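On March 23, 2019, Tencent YouTu hosted a technical salon in Beijing, co-organized by Tencent Cloud, Tencent AI Lab, and Geekbang, under the theme "Intelligent Transformation of the Future: A Brief Look at AI Technology Applications and Practice". RapidNet was mentioned at this event. The speaker was Wang Chuannan (王川南), a senior researcher in AI applications at Tencent YouTu, and his talk was titled "From Hardware to Algorithms: Tencent YouTu's AI Terminal Product Practice".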

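Tencent YouTu has built two frameworks in-house: NCNN, a high-performance forward-inference framework for mobile devices, and RapidNet, a deep learning inference framework. NCNN has already been open-sourced as Tencent/NCNN.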


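RapidNet, the latter, is a deep learning inference framework described as cross-platform and high-performance, with support for model compression and code trimming. It provides a unified calling interface and the same optimization strategies on every platform. For heterogeneous networks, RapidNet can make effective use of hardware acceleration and handles task scheduling across multi-core CPUs and GPUs. On the difficult problem of quantization, RapidNet is said to improve models such as gesture detection and tracking by 20% to 40% on most device models, with an average accuracy loss within 0.5%.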

ysh329 commented 4 years ago

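RapidNet also appears as early as 2017, in another article titled "Tencent QQ Zone Super-Resolution Technology TSR: Saves Users 3/4 of Their Traffic, with Better Quality and Speed than Google RAISR". The article states: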

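RapidNet, a deep learning framework derived from TSR, is on average 20 times faster than the Caffe2 and TensorFlow frameworks, and makes it possible to run deep learning on ordinary phones.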

Reportedly, RapidNet was first developed in 2016 for a face identity verification project.

ysh329 commented 4 years ago

Benchmark

ysh329 commented 4 years ago

Cross-compiling the benchmark for Android

Following the "Compile from source" (compile.md) and "Testing" (test.md) documents, the Ubuntu-based cross-compilation environment and steps are as follows:

  1. Install cmake (version 3.6 or later)
  2. Download the NDK (version >= 15c) from https://developer.android.com/ndk/downloads
  3. Set the environment variable: export ANDROID_NDK=<ndk_path>
  4. On Ubuntu: sudo apt-get install attr
  5. cd <path_to_tnn>/scripts
  6. Edit build_android.sh and set BENMARK_MODE="ON" at the top of the configuration options
  7. Run ./build_android.sh
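Before running the build script, it can help to confirm the prerequisites from steps 1 and 3. The following is a minimal Python sketch, not part of TNN: the function names are hypothetical, it only checks the cmake version and the ANDROID_NDK variable, and it does not verify the NDK release (>= 15c).

```python
import os
import re
import shutil
import subprocess


def parse_version(text):
    """Return the first dotted version number found in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def cmake_is_recent_enough(version_output, minimum=(3, 6)):
    """Compare `cmake --version` output with the minimum from step 1."""
    return parse_version(version_output) >= minimum


def check_environment(env=None):
    """Collect human-readable problems instead of stopping at the first one."""
    env = os.environ if env is None else env
    problems = []

    cmake = shutil.which("cmake")
    if cmake is None:
        problems.append("cmake not found on PATH")
    else:
        result = subprocess.run([cmake, "--version"], capture_output=True, text=True)
        try:
            if not cmake_is_recent_enough(result.stdout):
                problems.append("cmake is older than 3.6")
        except ValueError:
            problems.append("could not read the cmake version")

    ndk = env.get("ANDROID_NDK")
    if not ndk:
        problems.append("ANDROID_NDK is not set")
    elif not os.path.isdir(ndk):
        problems.append(f"ANDROID_NDK points to a missing directory: {ndk}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("problem:", problem)
```

If the script prints nothing, the two checked prerequisites look fine and ./build_android.sh can be tried.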

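After the build finishes, the release directory under the current directory contains the armeabi-v7a library, the arm64-v8a library, and the include headers. I did not build locally but on a server with 32 parallel threads, so it was quite fast.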

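Because benchmark mode is enabled, the build also produces the test executable TNNTest under the build directories, at build32/test/TNNTest and build64/test/TNNTest. It can be run directly on Linux, on Android via ADB, and in similar environments.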

Parameter -mp is not set
    -h                      print a usage message.
    -mt "<model type>"    specify model type: TNN, OPENVINO, COREML, SNPE.
    -mp "<model path>"    specify model path: tnn proto path, openvino xml path, coreml mlmodel path, snpe dlc path.
    -dt "<device type>"   specify tnn device type: CPU, X86, ARM, CUDA, METAL, OPENCL, default is CPU.
    -lp "<library path>"  specify tnn NetworkConfig library_path. For metal, it is the tnn.metallib full path
    -ic "<number>"        iterations count (default 1).
    -wc "<number>"        warm up count (default 0).
    -ip "<path>"          input file path
    -op "<path>"          output file path
    -dl "<device list>"   device list(eg: 0,1,2,3)
    -th "<thread umber>"  cpu thread num(eg: 0,1,2,3, default 1)
    -it "<input type>"    input format(0: nchw float, 1:bgr u8, 2, gray u8)
    -fc "<format for compare>"output format for comparison
    -pr "<precision >"    compute precision(HIGH, NORMAL, LOW)
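To make the options above concrete, here is a small Python sketch that assembles a TNNTest command line for a thread sweep. It uses only flags listed in the usage text (-mp, -dt, -th, -ic, -wc). The model file name, the device working directory /data/local/tmp, and the helper names are assumptions for illustration. The defaults of 100 iterations and 20 warm-up runs mirror the repeats and warmup columns of the results in this thread, though the thread does not say which tool or flags produced those numbers.

```python
import shlex


def tnn_test_args(model_path, device_type="ARM", threads=1, iterations=100, warmup=20):
    """Build the TNNTest argument list from flags shown in the usage text."""
    return [
        "./TNNTest",
        "-mp", model_path,        # model path (tnn proto path)
        "-dt", device_type,       # e.g. ARM or OPENCL
        "-th", str(threads),      # cpu thread num
        "-ic", str(iterations),   # iterations count
        "-wc", str(warmup),       # warm up count
    ]


def adb_shell_command(args, workdir="/data/local/tmp"):
    """Wrap the arguments for `adb shell`; workdir is a hypothetical device path."""
    remote = "cd " + shlex.quote(workdir) + " && " + " ".join(shlex.quote(a) for a in args)
    return ["adb", "shell", remote]


if __name__ == "__main__":
    # "model.tnnproto" is a placeholder, not a file named in this thread.
    for threads in (1, 2, 4):
        command = adb_shell_command(tnn_test_args("model.tnnproto", threads=threads))
        print(" ".join(command))
```

The returned list can be passed to subprocess.run once the binary, its libraries, and the model have been pushed to the device. Whether extra setup such as a library search path is needed depends on how TNN was built.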

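One of the documents mentions support for NCNN models. Since I already had Caffe MobileNetV1/V2 models converted to NCNN, I figured they could be used directly for the benchmark. Also, the source for TNNTest is at <path-to-tnn>/test/test.cc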

ysh329 commented 4 years ago

Model conversion

docker pull turandotkay/tnn-convert

ysh329 commented 4 years ago

Benchmark, code as of June 16

9386a50d9dccc

Earlier measurements did not count data copy time.

845

model_name,platform,soc,product,power_mode,backend,cpu_thread_num,avg,max,min,repeats,warmup
caffe_mobilenetv1.opt,android-armv7,sdm845,MI 8,big_cores,ARM,1,74.051,74.452,73.070,100,20
caffe_mobilenetv1.opt,android-armv7,sdm845,MI 8,big_cores,ARM,2,40.272,42.476,38.518,100,20
caffe_mobilenetv1.opt,android-armv7,sdm845,MI 8,big_cores,ARM,4,22.535,23.114,21.458,100,20
caffe_mobilenetv1.opt,android-armv7,sdm845,MI 8,big_cores,OPENCL,1,10.002,10.410,9.768,100,20
caffe_mobilenetv1.opt,android-armv8,sdm845,MI 8,big_cores,ARM,1,65.493,73.866,60.222,100,20
caffe_mobilenetv1.opt,android-armv8,sdm845,MI 8,big_cores,ARM,2,33.886,35.865,33.135,100,20
caffe_mobilenetv1.opt,android-armv8,sdm845,MI 8,big_cores,ARM,4,20.253,20.391,20.120,100,20
caffe_mobilenetv1.opt,android-armv8,sdm845,MI 8,big_cores,OPENCL,1,9.611,10.672,9.470,100,20
caffe_mobilenetv2.opt,android-armv7,sdm845,MI 8,big_cores,ARM,1,75.402,80.475,73.167,100,20
caffe_mobilenetv2.opt,android-armv7,sdm845,MI 8,big_cores,ARM,2,38.338,39.492,37.980,100,20
caffe_mobilenetv2.opt,android-armv7,sdm845,MI 8,big_cores,ARM,4,22.278,23.319,21.110,100,20
caffe_mobilenetv2.opt,android-armv7,sdm845,MI 8,big_cores,OPENCL,1,9.888,10.223,9.551,100,20
caffe_mobilenetv2.opt,android-armv8,sdm845,MI 8,big_cores,ARM,1,66.924,69.279,62.455,100,20
caffe_mobilenetv2.opt,android-armv8,sdm845,MI 8,big_cores,ARM,2,34.141,34.331,33.967,100,20
caffe_mobilenetv2.opt,android-armv8,sdm845,MI 8,big_cores,ARM,4,19.782,20.767,19.422,100,20
caffe_mobilenetv2.opt,android-armv8,sdm845,MI 8,big_cores,OPENCL,1,9.801,10.037,9.499,100,20
tf_mobilenetv1.opt,android-armv7,sdm845,MI 8,big_cores,ARM,1,89.797,91.649,87.120,100,20
tf_mobilenetv1.opt,android-armv7,sdm845,MI 8,big_cores,ARM,2,46.830,48.460,46.010,100,20
tf_mobilenetv1.opt,android-armv7,sdm845,MI 8,big_cores,ARM,4,24.643,24.798,24.511,100,20
tf_mobilenetv1.opt,android-armv7,sdm845,MI 8,big_cores,OPENCL,1,9.763,10.311,9.457,100,20
tf_mobilenetv1.opt,android-armv8,sdm845,MI 8,big_cores,ARM,1,75.530,82.265,73.722,100,20
tf_mobilenetv1.opt,android-armv8,sdm845,MI 8,big_cores,ARM,2,39.692,40.096,38.889,100,20
tf_mobilenetv1.opt,android-armv8,sdm845,MI 8,big_cores,ARM,4,21.661,21.759,21.529,100,20
tf_mobilenetv1.opt,android-armv8,sdm845,MI 8,big_cores,OPENCL,1,9.718,10.117,9.429,100,20
tf_mobilenetv2.opt,android-armv7,sdm845,MI 8,big_cores,ARM,1,55.530,56.103,53.653,100,20
tf_mobilenetv2.opt,android-armv7,sdm845,MI 8,big_cores,ARM,2,29.418,31.188,29.110,100,20
tf_mobilenetv2.opt,android-armv7,sdm845,MI 8,big_cores,ARM,4,17.794,18.265,17.633,100,20
tf_mobilenetv2.opt,android-armv7,sdm845,MI 8,big_cores,OPENCL,1,8.318,8.583,7.919,100,20
tf_mobilenetv2.opt,android-armv8,sdm845,MI 8,big_cores,ARM,1,46.463,47.034,46.053,100,20
tf_mobilenetv2.opt,android-armv8,sdm845,MI 8,big_cores,ARM,2,26.326,26.481,26.085,100,20
tf_mobilenetv2.opt,android-armv8,sdm845,MI 8,big_cores,ARM,4,16.161,16.302,16.048,100,20
tf_mobilenetv2.opt,android-armv8,sdm845,MI 8,big_cores,OPENCL,1,8.211,8.424,7.866,100,20
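One way to read these results is to compute how well the ARM backend scales with thread count. The sketch below parses rows in the CSV layout used here and divides the single-thread avg by the avg at each thread count. The sample rows are copied from the sdm845 results for caffe_mobilenetv1.opt on android-armv8; the function names are hypothetical, and the unit of avg does not matter since only ratios are computed.

```python
import csv
import io

# Header plus four rows copied from the sdm845 results (caffe_mobilenetv1, armv8).
SAMPLE = """\
model_name,platform,soc,product,power_mode,backend,cpu_thread_num,avg,max,min,repeats,warmup
caffe_mobilenetv1.opt,android-armv8,sdm845,MI 8,big_cores,ARM,1,65.493,73.866,60.222,100,20
caffe_mobilenetv1.opt,android-armv8,sdm845,MI 8,big_cores,ARM,2,33.886,35.865,33.135,100,20
caffe_mobilenetv1.opt,android-armv8,sdm845,MI 8,big_cores,ARM,4,20.253,20.391,20.120,100,20
caffe_mobilenetv1.opt,android-armv8,sdm845,MI 8,big_cores,OPENCL,1,9.611,10.672,9.470,100,20
"""


def load_rows(text):
    """Parse the CSV text into a list of dicts keyed by the header names."""
    return list(csv.DictReader(io.StringIO(text)))


def thread_speedups(rows, model, platform, backend="ARM"):
    """Map thread count to (single-thread avg) / (avg at that thread count)."""
    avg_by_threads = {
        int(row["cpu_thread_num"]): float(row["avg"])
        for row in rows
        if row["model_name"] == model
        and row["platform"] == platform
        and row["backend"] == backend
    }
    baseline = avg_by_threads[1]
    return {threads: baseline / avg for threads, avg in sorted(avg_by_threads.items())}


if __name__ == "__main__":
    speedups = thread_speedups(load_rows(SAMPLE), "caffe_mobilenetv1.opt", "android-armv8")
    for threads, speedup in speedups.items():
        print(f"{threads} thread(s): {speedup:.2f}x")
```

For the sample rows this gives about 1.93x at 2 threads and 3.23x at 4 threads relative to a single thread.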

ysh329 commented 4 years ago

855

ysh329 commented 4 years ago

970

model_name,platform,soc,product,power_mode,backend,cpu_thread_num,avg,max,min,repeats,warmup
caffe_mobilenetv1.opt,android-armv7,kirin970,ALP-TL00,big_cores,ARM,1,107.556,111.029,105.036,100,20
caffe_mobilenetv1.opt,android-armv7,kirin970,ALP-TL00,big_cores,ARM,2,58.156,83.359,56.328,100,20
caffe_mobilenetv1.opt,android-armv7,kirin970,ALP-TL00,big_cores,ARM,4,36.475,51.867,32.798,100,20
caffe_mobilenetv1.opt,android-armv7,kirin970,ALP-TL00,big_cores,OPENCL,1,12.236,20.555,9.001,100,20
caffe_mobilenetv1.opt,android-armv8,kirin970,ALP-TL00,big_cores,ARM,1,88.486,93.080,86.786,100,20
caffe_mobilenetv1.opt,android-armv8,kirin970,ALP-TL00,big_cores,ARM,2,48.816,52.496,46.131,100,20
caffe_mobilenetv1.opt,android-armv8,kirin970,ALP-TL00,big_cores,ARM,4,29.331,34.217,26.482,100,20
caffe_mobilenetv1.opt,android-armv8,kirin970,ALP-TL00,big_cores,OPENCL,1,11.215,14.986,10.441,100,20
caffe_mobilenetv2.opt,android-armv7,kirin970,ALP-TL00,big_cores,ARM,1,111.791,136.880,95.845,100,20
caffe_mobilenetv2.opt,android-armv7,kirin970,ALP-TL00,big_cores,ARM,2,58.194,87.362,53.886,100,20
caffe_mobilenetv2.opt,android-armv7,kirin970,ALP-TL00,big_cores,ARM,4,35.895,39.189,31.538,100,20
caffe_mobilenetv2.opt,android-armv7,kirin970,ALP-TL00,big_cores,OPENCL,1,23.235,24.123,22.595,100,20
caffe_mobilenetv2.opt,android-armv8,kirin970,ALP-TL00,big_cores,ARM,1,89.151,106.630,76.145,100,20
caffe_mobilenetv2.opt,android-armv8,kirin970,ALP-TL00,big_cores,ARM,2,48.145,52.017,43.836,100,20
caffe_mobilenetv2.opt,android-armv8,kirin970,ALP-TL00,big_cores,ARM,4,28.690,31.582,25.866,100,20
caffe_mobilenetv2.opt,android-armv8,kirin970,ALP-TL00,big_cores,OPENCL,1,25.101,26.068,24.414,100,20
tf_mobilenetv1.opt,android-armv7,kirin970,ALP-TL00,big_cores,ARM,1,108.062,110.975,104.464,100,20
tf_mobilenetv1.opt,android-armv7,kirin970,ALP-TL00,big_cores,ARM,2,60.862,85.881,58.500,100,20
tf_mobilenetv1.opt,android-armv7,kirin970,ALP-TL00,big_cores,ARM,4,39.852,62.611,34.951,100,20
tf_mobilenetv1.opt,android-armv7,kirin970,ALP-TL00,big_cores,OPENCL,1,20.759,22.477,20.068,100,20
tf_mobilenetv1.opt,android-armv8,kirin970,ALP-TL00,big_cores,ARM,1,96.177,123.638,88.203,100,20
tf_mobilenetv1.opt,android-armv8,kirin970,ALP-TL00,big_cores,ARM,2,49.421,51.116,48.243,100,20
tf_mobilenetv1.opt,android-armv8,kirin970,ALP-TL00,big_cores,ARM,4,32.452,36.848,28.647,100,20
tf_mobilenetv1.opt,android-armv8,kirin970,ALP-TL00,big_cores,OPENCL,1,23.173,24.346,21.782,100,20
tf_mobilenetv2.opt,android-armv7,kirin970,ALP-TL00,big_cores,ARM,1,73.605,76.404,70.380,100,20
tf_mobilenetv2.opt,android-armv7,kirin970,ALP-TL00,big_cores,ARM,2,42.412,66.658,40.365,100,20
tf_mobilenetv2.opt,android-armv7,kirin970,ALP-TL00,big_cores,ARM,4,29.919,63.733,25.373,100,20
tf_mobilenetv2.opt,android-armv7,kirin970,ALP-TL00,big_cores,OPENCL,1,11.244,12.342,10.416,100,20
tf_mobilenetv2.opt,android-armv8,kirin970,ALP-TL00,big_cores,ARM,1,58.772,60.373,57.735,100,20
tf_mobilenetv2.opt,android-armv8,kirin970,ALP-TL00,big_cores,ARM,2,36.215,41.180,33.666,100,20
tf_mobilenetv2.opt,android-armv8,kirin970,ALP-TL00,big_cores,ARM,4,25.223,29.250,21.868,100,20
tf_mobilenetv2.opt,android-armv8,kirin970,ALP-TL00,big_cores,OPENCL,1,11.831,14.413,11.245,100,20
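Several kirin970 rows show a wide gap between min and max, so a rough jitter indicator may help when comparing the two devices. The Python sketch below computes (max - min) / avg for the same configuration on both SoCs. The metric and the helper name are my own choice for illustration, not something TNN reports; the numbers are copied from the caffe_mobilenetv1.opt, android-armv7, ARM, 2-thread rows.

```python
def relative_spread(avg, max_value, min_value):
    """Return (max - min) / avg, a unitless indicator of run-to-run variation."""
    return (max_value - min_value) / avg


# (soc, avg, max, min) for caffe_mobilenetv1.opt, android-armv7, ARM, 2 threads.
ROWS = [
    ("sdm845", 40.272, 42.476, 38.518),
    ("kirin970", 58.156, 83.359, 56.328),
]

if __name__ == "__main__":
    for soc, avg, max_value, min_value in ROWS:
        spread = relative_spread(avg, max_value, min_value)
        print(f"{soc}: spread = {spread:.1%} of avg")
```

A large spread suggests that the avg for that row is influenced by outliers (for example thermal throttling or scheduling), which is a hypothesis worth checking by rerunning rather than a conclusion.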