airockchip / rknn-toolkit2

On RK3588, benchmarking YOLOv5s with rknn_benchmark shows the 3-core frame rate is barely faster than the 1-core frame rate #129

Open BUG1989 opened 2 months ago

Board

ROCK 5 Model B

Log

Single core

radxa@rock-5b:~/rknn/test/rknn_benchmark_Linux$ ./rknn_benchmark ../yolov5s.rknn ../cat.jpg 10 1
rknn_api/rknnrt version: 2.1.0 (967d001cc8@2024-08-07T19:28:19), driver version: 0.9.6
total weight size: 7315840, total internal size: 6144000
total dma used size: 22917120
model input num: 1, output num: 3
input tensors:
  index=0, name=images, n_dims=4, dims=[1, 640, 640, 3], n_elems=1228800, size=1228800, w_stride = 640, size_with_stride=1228800, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
output tensors:
  index=0, name=326, n_dims=4, dims=[1, 255, 80, 80], n_elems=1632000, size=1632000, w_stride = 0, size_with_stride=1638400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=57, scale=0.109260
  index=1, name=370, n_dims=4, dims=[1, 255, 40, 40], n_elems=408000, size=408000, w_stride = 0, size_with_stride=491520, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=46, scale=0.096394
  index=2, name=414, n_dims=4, dims=[1, 255, 20, 20], n_elems=102000, size=102000, w_stride = 0, size_with_stride=163840, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=43, scale=0.086990
custom string:
Warmup ...
   0: Elapse Time = 27.64ms, FPS = 36.18
   1: Elapse Time = 27.52ms, FPS = 36.34
   2: Elapse Time = 27.50ms, FPS = 36.36
   3: Elapse Time = 27.49ms, FPS = 36.38
   4: Elapse Time = 27.53ms, FPS = 36.32
Begin perf ...
   0: Elapse Time = 27.53ms, FPS = 36.32
   1: Elapse Time = 27.53ms, FPS = 36.32
   2: Elapse Time = 27.54ms, FPS = 36.31
   3: Elapse Time = 27.52ms, FPS = 36.34
   4: Elapse Time = 27.52ms, FPS = 36.34
   5: Elapse Time = 27.53ms, FPS = 36.32
   6: Elapse Time = 27.50ms, FPS = 36.36
   7: Elapse Time = 27.53ms, FPS = 36.33
   8: Elapse Time = 27.54ms, FPS = 36.31
   9: Elapse Time = 26.18ms, FPS = 38.20

Avg Time 27.39ms, Avg FPS = 36.507

Save output to rt_output0.npy
Save output to rt_output1.npy
Save output to rt_output2.npy
---- Top5 ----
3.714852 - 562726
3.714852 - 562727
3.605592 - 562646
3.605592 - 1108032
3.605592 - 1108033
---- Top5 ----
3.952167 - 275862
3.952167 - 276182
3.855773 - 275822
3.759379 - 275941
3.662984 - 275863
---- Top5 ----
3.740573 - 1147
3.740573 - 35109
3.653583 - 35129
3.566593 - 1167
3.566593 - 35089
radxa@rock-5b:~/rknn/test/rknn_benchmark_Linux$

Three cores

radxa@rock-5b:~/rknn/test/rknn_benchmark_Linux$ ./rknn_benchmark ../yolov5s.rknn ../cat.jpg 10 7
rknn_api/rknnrt version: 2.1.0 (967d001cc8@2024-08-07T19:28:19), driver version: 0.9.6
total weight size: 7315840, total internal size: 6144000
total dma used size: 22917120
model input num: 1, output num: 3
input tensors:
  index=0, name=images, n_dims=4, dims=[1, 640, 640, 3], n_elems=1228800, size=1228800, w_stride = 640, size_with_stride=1228800, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
output tensors:
  index=0, name=326, n_dims=4, dims=[1, 255, 80, 80], n_elems=1632000, size=1632000, w_stride = 0, size_with_stride=1638400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=57, scale=0.109260
  index=1, name=370, n_dims=4, dims=[1, 255, 40, 40], n_elems=408000, size=408000, w_stride = 0, size_with_stride=491520, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=46, scale=0.096394
  index=2, name=414, n_dims=4, dims=[1, 255, 20, 20], n_elems=102000, size=102000, w_stride = 0, size_with_stride=163840, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=43, scale=0.086990
custom string:

Warmup ...
   0: Elapse Time = 26.67ms, FPS = 37.50
   1: Elapse Time = 26.51ms, FPS = 37.73
   2: Elapse Time = 26.50ms, FPS = 37.74
   3: Elapse Time = 26.55ms, FPS = 37.67
   4: Elapse Time = 26.53ms, FPS = 37.69
Begin perf ...
   0: Elapse Time = 26.54ms, FPS = 37.68
   1: Elapse Time = 26.49ms, FPS = 37.74
   2: Elapse Time = 26.54ms, FPS = 37.68
   3: Elapse Time = 26.52ms, FPS = 37.71
   4: Elapse Time = 26.51ms, FPS = 37.71
   5: Elapse Time = 26.56ms, FPS = 37.65
   6: Elapse Time = 25.15ms, FPS = 39.76
   7: Elapse Time = 20.80ms, FPS = 48.07
   8: Elapse Time = 20.79ms, FPS = 48.10
   9: Elapse Time = 20.80ms, FPS = 48.09

Avg Time 24.67ms, Avg FPS = 40.535

Save output to rt_output0.npy
Save output to rt_output1.npy
Save output to rt_output2.npy
---- Top5 ----
3.714852 - 562726
3.714852 - 562727
3.605592 - 562646
3.605592 - 1108032
3.605592 - 1108033
---- Top5 ----
3.952167 - 275862
3.952167 - 276182
3.855773 - 275822
3.759379 - 275941
3.662984 - 275863
---- Top5 ----
3.740573 - 1147
3.740573 - 35109
3.653583 - 35129
3.566593 - 1167
3.566593 - 35089
radxa@rock-5b:~/rknn/test/rknn_benchmark_Linux$
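
To put a number on the gap, here is a rough sketch that recomputes the averages from the "Begin perf" lines of the two logs above. The timing values are copied by hand from those logs; the variable and function names are only illustrative and are not part of rknn_benchmark. It also looks separately at the last three iterations of the three-core run, which dropped to about 20.8 ms while the first six stayed near 26.5 ms.

```python
from statistics import mean

# "Begin perf" elapsed times in ms, copied by hand from the two logs above.
single_core_ms = [27.53, 27.53, 27.54, 27.52, 27.52,
                  27.53, 27.50, 27.53, 27.54, 26.18]
three_core_ms = [26.54, 26.49, 26.54, 26.52, 26.51,
                 26.56, 25.15, 20.80, 20.79, 20.80]


def summarize(times_ms):
    """Return (average latency in ms, FPS derived from that average)."""
    avg = mean(times_ms)
    return avg, 1000.0 / avg


avg1, fps1 = summarize(single_core_ms)
avg3, fps3 = summarize(three_core_ms)

# The three-core run is not steady: compare its first six and last three
# iterations separately (iteration 6 sits in between and is left out).
head3 = mean(three_core_ms[:6])
tail3 = mean(three_core_ms[-3:])

print(f"single core : {avg1:.2f} ms, {fps1:.3f} FPS")
print(f"three cores : {avg3:.2f} ms, {fps3:.3f} FPS")
print(f"overall speedup          : {avg1 / avg3:.2f}x")
print(f"three cores, first six   : {head3:.2f} ms")
print(f"three cores, last three  : {tail3:.2f} ms")
print(f"speedup on last three    : {avg1 / tail3:.2f}x")
```

Even taking only the fastest three-core iterations, the speedup stays well below 3x, which is the behaviour this issue is asking about. With only 10 iterations per run, a longer loop count would show whether the roughly 20.8 ms figure is the steady state.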