Tencent / ncnn

ncnn is a high-performance neural network inference framework optimized for the mobile platform

The running speed of ncnn-vulkan is slower than the ncnn-cpu? #2743

Open GuideWsp opened 3 years ago

GuideWsp commented 3 years ago

I built ncnn with Vulkan on my MacBook Pro. When I ran the benchmark, I found that the Vulkan version is slower than the CPU version.

The following is the CPU version:

    $ ../build-vulkan/benchmark/benchncnn 20 4 0 -1
    loop_count = 20
    num_threads = 4
    powersave = 0
    gpu_device = -1
    cooling_down = 1
    squeezenet min = 5.11 max = 5.69 avg = 5.27
    squeezenet_int8 min = 35.68 max = 41.72 avg = 37.79
    mobilenet min = 6.84 max = 8.82 avg = 7.44
    mobilenet_int8 min = 74.62 max = 88.52 avg = 81.84
    mobilenet_v2 min = 4.78 max = 12.30 avg = 8.00
    mobilenet_v3 min = 4.09 max = 8.69 avg = 5.77
    shufflenet min = 6.06 max = 10.43 avg = 7.95
    shufflenet_v2 min = 5.23 max = 7.29 avg = 5.84
    mnasnet min = 4.32 max = 6.14 avg = 5.47
    proxylessnasnet min = 5.66 max = 7.38 avg = 6.21
    efficientnet_b0 min = 7.27 max = 9.25 avg = 7.60
    regnety_400m min = 13.16 max = 14.11 avg = 13.59
    blazeface min = 1.57 max = 2.83 avg = 2.03
    googlenet min = 19.07 max = 26.32 avg = 20.55
    googlenet_int8 min = 99.94 max = 165.68 avg = 126.40
    resnet18 min = 27.38 max = 31.99 avg = 28.30
    resnet18_int8 min = 58.62 max = 104.79 avg = 82.64
    alexnet min = 27.87 max = 29.43 avg = 28.51
    vgg16 min = 90.04 max = 108.21 avg = 97.25
    vgg16_int8 min = 274.17 max = 301.51 avg = 284.13
    resnet50 min = 39.50 max = 52.41 avg = 43.46
    resnet50_int8 min = 275.33 max = 364.16 avg = 295.09
    squeezenet_ssd min = 34.14 max = 37.58 avg = 35.50
    squeezenet_ssd_int8 min = 49.95 max = 66.82 avg = 53.87
    mobilenet_ssd min = 13.69 max = 20.12 avg = 15.75
    mobilenet_ssd_int8 min = 144.61 max = 170.56 avg = 152.37
    mobilenet_yolo min = 31.74 max = 42.10 avg = 34.60
    mobilenetv2_yolov3 min = 18.11 max = 24.41 avg = 19.31
    yolov4-tiny min = 34.93 max = 43.55 avg = 36.86

And the following is the Vulkan version:

    [0 Intel(R) Iris(TM) Plus Graphics 655] queueC=0[1] queueG=0[1] queueT=0[1]
    [0 Intel(R) Iris(TM) Plus Graphics 655] bugsbn1=0 bugbilz=0 bugcopc=0 bugihfa=0
    [0 Intel(R) Iris(TM) Plus Graphics 655] fp16-p/s/a=1/1/1 int8-p/s/a=1/1/1
    [0 Intel(R) Iris(TM) Plus Graphics 655] subgroup=0 basic=0 vote=0 ballot=0 shuffle=0
    loop_count = 20
    num_threads = 4
    powersave = 0
    gpu_device = 0
    cooling_down = 1
    squeezenet min = 9.36 max = 14.37 avg = 11.53
    squeezenet_int8 min = 34.04 max = 56.56 avg = 42.23
    mobilenet min = 9.04 max = 12.62 avg = 10.53
    mobilenet_int8 min = 73.54 max = 80.95 avg = 76.63
    mobilenet_v2 min = 13.86 max = 16.72 avg = 14.75
    mobilenet_v3 min = 14.08 max = 17.77 avg = 16.48
    shufflenet min = 10.94 max = 16.75 avg = 12.31
    shufflenet_v2 min = 12.64 max = 15.94 avg = 13.80
    mnasnet min = 14.60 max = 18.26 avg = 15.86
    proxylessnasnet min = 15.03 max = 22.51 avg = 17.28
    efficientnet_b0 min = 20.00 max = 27.08 avg = 22.85
    regnety_400m min = 18.51 max = 21.22 avg = 19.18
    blazeface min = 4.40 max = 6.45 avg = 4.87
    googlenet min = 23.60 max = 37.03 avg = 32.33
    googlenet_int8 min = 101.09 max = 115.62 avg = 106.02
    resnet18 min = 27.92 max = 33.67 avg = 29.43
    resnet18_int8 min = 59.91 max = 67.26 avg = 63.45
    alexnet min = 25.93 max = 30.92 avg = 27.97
    vgg16 min = 83.82 max = 88.50 avg = 85.58
    vgg16_int8 min = 270.59 max = 291.21 avg = 277.76
    resnet50 min = 39.01 max = 42.40 avg = 40.53
    resnet50_int8 min = 264.25 max = 294.19 avg = 273.57
    squeezenet_ssd min = 30.50 max = 44.29 avg = 36.31
    squeezenet_ssd_int8 min = 51.39 max = 61.68 avg = 53.76
    mobilenet_ssd min = 23.94 max = 27.97 avg = 25.07
    mobilenet_ssd_int8 min = 146.09 max = 212.84 avg = 161.79
    mobilenet_yolo min = 42.57 max = 59.62 avg = 44.86
    mobilenetv2_yolov3 min = 21.30 max = 31.96 avg = 24.69
    yolov4-tiny min = 42.21 max = 48.10 avg = 45.22

Can anyone help me with this?
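For anyone comparing the two logs above, here is a minimal Python sketch that pulls the `avg` column out of benchncnn-style output and prints the Vulkan/CPU ratio per model. It is not part of ncnn; the names `parse_bench` and `slowdown` are made up for illustration, and the sample text is a small subset of the numbers posted above. A ratio above 1 means the Vulkan run was slower for that model.

```python
import re

# Matches one result entry such as "squeezenet min = 5.11 max = 5.69 avg = 5.27".
# Header entries like "loop_count = 20" contain no "min =" and are skipped.
LINE_RE = re.compile(
    r"([\w-]+)\s+min =\s*([\d.]+)\s+max =\s*([\d.]+)\s+avg =\s*([\d.]+)"
)


def parse_bench(text):
    """Return {model_name: avg_time} for every result entry found in text."""
    return {m.group(1): float(m.group(4)) for m in LINE_RE.finditer(text)}


def slowdown(cpu, gpu):
    """Return {model_name: gpu_avg / cpu_avg} for models present in both runs."""
    return {name: gpu[name] / cpu[name] for name in cpu if name in gpu}


# Subset of the CPU run (gpu_device = -1) copied from the post above.
CPU_LOG = """
loop_count = 20
squeezenet min = 5.11 max = 5.69 avg = 5.27
mobilenet_v2 min = 4.78 max = 12.30 avg = 8.00
vgg16 min = 90.04 max = 108.21 avg = 97.25
yolov4-tiny min = 34.93 max = 43.55 avg = 36.86
"""

# Same models from the Vulkan run (gpu_device = 0).
VULKAN_LOG = """
loop_count = 20
squeezenet min = 9.36 max = 14.37 avg = 11.53
mobilenet_v2 min = 13.86 max = 16.72 avg = 14.75
vgg16 min = 83.82 max = 88.50 avg = 85.58
yolov4-tiny min = 42.21 max = 48.10 avg = 45.22
"""

ratios = slowdown(parse_bench(CPU_LOG), parse_bench(VULKAN_LOG))
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>16}  vulkan/cpu = {ratio:.2f}")
```

On the posted numbers, the small networks (squeezenet, the mobilenet family) are where the Vulkan run loses the most, while vgg16 and resnet50 come out slightly ahead on the GPU.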

ncnnnnn commented 3 years ago

Not all milk is called terenzo.

Xelawk commented 3 years ago

I have the same issue on my MacBook Pro. Have you resolved it?