Tencent / ncnn

ncnn is a high-performance neural network inference framework optimized for the mobile platform

caffe model int8 inference slower than fp32 inference #842

Closed happyboyneu closed 5 years ago

happyboyneu commented 5 years ago

Hello, I have a ResNet-50 caffe model and use ncnn to do inference on armeabi-v7a; the time is 1534 ms. I used Caffe-Int8-Convert-Tools to convert my model to int8 and ran inference; the time is 2631 ms. It's strange that the int8 time is longer. I built the Android .so as the guide says. I want to know whether there is something I missed, or whether somebody has the same result?
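For context, the conversion the poster describes boils down to symmetric per-tensor quantization: a calibration threshold `T` is chosen per layer and float values are mapped to int8 with `scale = 127 / T`. A minimal NumPy sketch of that round-trip (the function names and the plain max-based threshold here are illustrative assumptions; Caffe-Int8-Convert-Tools calibrates the threshold with a KL-divergence search over activation histograms):

```python
import numpy as np

def int8_quantize(x, scale):
    """Symmetric per-tensor quantization: float32 -> int8 codes in [-127, 127]."""
    q = np.round(x * scale)
    return np.clip(q, -127, 127).astype(np.int8)

def int8_dequantize(q, scale):
    """Map int8 codes back to approximate float32 values."""
    return q.astype(np.float32) / scale

# Toy activation tensor; a real tool calibrates the threshold
# (e.g. via KL divergence) instead of taking the plain abs-max.
x = np.array([0.5, -1.2, 3.0, -0.01], dtype=np.float32)
threshold = np.abs(x).max()
scale = 127.0 / threshold  # 127 int8 steps cover [-threshold, threshold]

q = int8_quantize(x, scale)
x_hat = int8_dequantize(q, scale)
print(q)                          # int8 codes
print(np.abs(x - x_hat).max())    # round-trip error, roughly bounded by 0.5 / scale
```

The int8 codes themselves are cheap to compute; the speedup only materializes if the convolution kernels use int8 SIMD arithmetic, which is why per-platform optimization (the subject of the replies below) matters.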

BUG1989 commented 5 years ago

On the arm64-v8a platform. I'm doing my best to optimize it.

happyboyneu commented 5 years ago

> On the arm64-v8a platform. I'm doing my best to optimize it.

Mine is armeabi-v7a, not arm64-v8a.

BUG1989 commented 5 years ago

@happyboyneu Can you show me the running log of benchncnn on your armv7a platform?

happyboyneu commented 5 years ago

> @happyboyneu Can you show me the running log of benchncnn on your armv7a platform?

```
shell@firefly:/data/local/tmp # ./benchncnn 4 4 0
WARNING: linker: ./benchncnn: unused DT entry: type 0x6ffffffe arg 0xce4
WARNING: linker: ./benchncnn: unused DT entry: type 0x6fffffff arg 0x3
loop_count = 4
num_threads = 4
powersave = 0
gpu_device = -1
          squeezenet  min =  152.45  max =  152.98  avg =  152.69
           mobilenet  min =  272.45  max =  273.12  avg =  272.77
        mobilenet_v2  min =  201.46  max =  202.19  avg =  201.93
          shufflenet  min =   83.66  max =   84.14  avg =   83.91
             mnasnet  min =  170.46  max =  170.55  avg =  170.52
     proxylessnasnet  min =  197.33  max =  197.65  avg =  197.57
           googlenet  min =  602.35  max =  628.81  avg =  610.83
            resnet18  min =  538.48  max =  538.72  avg =  538.60
             alexnet  min =  560.04  max =  562.34  avg =  560.72
               vgg16  min = 2679.80  max = 2685.68  avg = 2682.01
      squeezenet-ssd  min =  328.99  max =  330.00  avg =  329.32
       mobilenet-ssd  min =  516.08  max =  517.31  avg =  516.62
      mobilenet-yolo  min = 1286.66  max = 1286.95  avg = 1286.75
    mobilenet-yolov3  min = 1258.53  max = 1260.10  avg = 1259.15
```

nihui commented 5 years ago

Massive optimization for arm64-v8a has landed in the latest ncnn int8 inference code.