Tencent / ncnn

ncnn is a high-performance neural network inference framework optimized for the mobile platform

yolact example #1883

Open chang248 opened 4 years ago

chang248 commented 4 years ago

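While using the provided yolact example, I found that even with Vulkan enabled, the computation still relies mainly on the CPU; CPU usage often goes above 90%. Why is that? Is there a way to make it run on the GPU instead? OpenCV and CUDA are both set up.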

nihui commented 4 years ago

Maybe it isn't running on the GPU? If a GPU is available, a few lines of GPU information are printed at startup.

chang248 commented 4 years ago

[0 GeForce GTX 1660 Ti] queueC=2[8] queueG=0[16] queueT=1[2]
[0 GeForce GTX 1660 Ti] buglssc=0 bugsbn1=0 buglbia=0 bugihfa=0
[0 GeForce GTX 1660 Ti] fp16p=1 fp16s=1 fp16a=1 int8s=1 int8a=1

It printed these lines, which should mean the GPU is being used, right? Yet CPU usage is still around 90% and GPU usage is around 1%.

nihui commented 4 years ago

That's not normal. Could you try running benchncnn on the GPU and see?

chang248 commented 4 years ago

Here is the result after the run:

[0 GeForce GTX 1660 Ti] queueC=2[8] queueG=0[16] queueT=1[2]
[0 GeForce GTX 1660 Ti] buglssc=0 bugsbn1=0 buglbia=0 bugihfa=0
[0 GeForce GTX 1660 Ti] fp16p=1 fp16s=1 fp16a=1 int8s=1 int8a=1
loop_count = 20
num_threads = 8
powersave = 0
gpu_device = 0
cooling_down = 0
squeezenet min = 1.95 max = 2.19 avg = 2.13
mobilenet min = 1.99 max = 2.17 avg = 2.05
mobilenet_v2 min = 2.49 max = 2.85 avg = 2.69
mobilenet_v3 min = 3.01 max = 3.35 avg = 3.21
shufflenet min = 2.06 max = 2.31 avg = 2.23
shufflenet_v2 min = 2.62 max = 2.88 avg = 2.80
mnasnet min = 2.73 max = 4.16 avg = 2.85
proxylessnasnet min = 2.72 max = 2.93 avg = 2.87
googlenet min = 6.92 max = 7.42 avg = 7.29
resnet18 min = 3.02 max = 3.30 avg = 3.21
alexnet min = 3.05 max = 3.56 avg = 3.39
vgg16 min = 13.40 max = 15.02 avg = 14.21
resnet50 min = 6.20 max = 6.42 avg = 6.31
squeezenet_ssd min = 7.41 max = 8.09 avg = 7.79
mobilenet_ssd min = 4.07 max = 5.02 avg = 4.32
mobilenet_yolo min = 5.59 max = 6.74 avg = 5.91
mobilenetv2_yolov3 min = 3.85 max = 4.29 avg = 4.14

But I watched while it was running, and it was still using the CPU: the GPU peaked at only 5%, while the CPU even reached 100%.
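For readers who want to compare these benchmark figures with a frame-rate target, here is a minimal sketch that parses one result entry and converts its average latency to frames per second. It assumes the benchncnn figures are milliseconds per inference; BenchEntry, parse_bench_entry, and frames_per_second are hypothetical helper names made up for this illustration and are not part of ncnn.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical container for one benchncnn result entry (not an ncnn type).
struct BenchEntry {
    std::string name;
    double min_ms = 0.0;
    double max_ms = 0.0;
    double avg_ms = 0.0;
};

// Parse text of the form "<name> min = <x> max = <y> avg = <z>".
// Returns false if the text does not have that shape.
bool parse_bench_entry(const std::string& text, BenchEntry& out)
{
    std::istringstream in(text);
    std::string kw_min, kw_max, kw_avg, eq1, eq2, eq3;
    if (!(in >> out.name
             >> kw_min >> eq1 >> out.min_ms
             >> kw_max >> eq2 >> out.max_ms
             >> kw_avg >> eq3 >> out.avg_ms)) {
        return false;
    }
    return kw_min == "min" && kw_max == "max" && kw_avg == "avg"
        && eq1 == "=" && eq2 == "=" && eq3 == "=";
}

// Convert an average latency to throughput.
// Assumption: avg_ms is milliseconds per inference.
double frames_per_second(double avg_ms)
{
    assert(avg_ms > 0.0);
    return 1000.0 / avg_ms;
}
```

Under that assumption, the mobilenet_yolo entry (avg = 5.91) works out to roughly 169 inferences per second, so the benchmark itself does not look CPU-bound.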

nihui commented 4 years ago

This is probably a display issue in Windows Task Manager. You can use GPU-Z to check the load.

chang248 commented 4 years ago

I checked with GPU-Z: CPU usage is 98% and GPU usage is 40%-50%. Is this normal? Thank you very much for the help.

chang248 commented 4 years ago

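@nihui I am using yolact to process real-time video and it tops out at about 12 frames per second, yet the GPU is only about 40% utilized, so I measured the timing.

ex.extract("619", maskmaps); // 138x138 x 32
ex.extract("816", location); // 4 x 19248
ex.extract("818", mask); // maskdim 32 x 19248
ex.extract("820", confidence); // 81 x 19248

These four lines take close to 100 ms per frame. Is there any way to raise GPU utilization and make processing faster?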
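A measurement like the one described can be sketched with std::chrono alone. The helpers below, time_ms and average_ms, are hypothetical names for this illustration and are not part of ncnn; they time any callable, so each extract call can be wrapped separately to see which one dominates. As far as I understand ncnn, the extractor evaluates the network lazily, so the forward pass probably runs inside these calls rather than before them, which would explain why they account for most of the frame time.

```cpp
#include <cassert>
#include <chrono>

// Run `fn` once and return the elapsed wall-clock time in milliseconds.
template <typename Fn>
double time_ms(Fn&& fn)
{
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Run `fn` `runs` times and return the mean time per run in milliseconds.
// Averaging smooths out one-off costs such as warm-up on the first frame.
template <typename Fn>
double average_ms(Fn&& fn, int runs)
{
    assert(runs > 0);
    double total = 0.0;
    for (int i = 0; i < runs; ++i) {
        total += time_ms(fn);
    }
    return total / runs;
}

// Hypothetical usage inside the yolact example, where `ex` and `maskmaps`
// already exist in the surrounding code:
//   double t = time_ms([&] { ex.extract("619", maskmaps); });
```

Timing each call on its own, rather than all four together, should show whether the cost sits in the first call or is spread across the four outputs.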

Mouna96 commented 2 years ago

@chang248 I am facing the same issue: those four lines make processing slow. Did you find out how to solve it? Also, can you tell me how you enabled the GPU for running yolact?