I use the mali_cl_peak_flops_example application to measure the peak compute capability of my Mali GPU. I then run ResNet-50 inference on the same GPU. While inference is running, Streamline shows GPU utilization (GPU Active) at nearly 100%. I get the ResNet-50 compute cost (MACs) from onnx_tool. From these I compute the GPU compute utilization rate:
resnet50_model_computation / inference_time_after_warm_up / mali_gpu_computation_capability x 100%
The result is less than 15%, which seems too low to me.
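To make the calculation concrete, here is a minimal Python sketch of the formula above. The function name `compute_utilization` and all numbers are hypothetical placeholders, not my measurements. The `flops_per_mac` factor is an assumption: peak-FLOPS benchmarks commonly count one multiply-accumulate as two floating-point operations, so the MAC count from onnx_tool may need to be doubled before comparing it with the peak figure. I have not confirmed how mali_cl_peak_flops_example counts operations.

```python
def compute_utilization(model_macs, inference_time_s, peak_flops, flops_per_mac=2.0):
    """Return achieved compute throughput as a percentage of the measured peak.

    model_macs       -- MAC count of one inference (e.g. as reported by onnx_tool)
    inference_time_s -- time of one inference after warm-up, in seconds
    peak_flops       -- peak throughput measured on the GPU, in FLOP/s
    flops_per_mac    -- ASSUMPTION: 1 MAC = 2 FLOPs; set to 1.0 if the peak
                        figure is already expressed in MAC/s
    """
    if inference_time_s <= 0 or peak_flops <= 0:
        raise ValueError("inference_time_s and peak_flops must be positive")
    achieved_flops = model_macs * flops_per_mac / inference_time_s
    return achieved_flops / peak_flops * 100.0


# Placeholder values for illustration only (not real measurements).
example = compute_utilization(model_macs=4.0e9, inference_time_s=0.1, peak_flops=1.0e12)
print(f"{example:.1f}%")
```

If the peak figure and the model cost are not in the same unit (FLOPs versus MACs), the rate computed this way can be off by a factor of two.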
The ResNet-50 model takes a 1x3x224x224 input. It is a standard model downloaded from the internet, not a custom one.
Have you measured this rate before?