CL_Tuner
Package Filename /root/mnasnet_1.0_224_1_default_1/
===================================
MODEL_LOAD takes 27.815 ms
PREPARE takes 3220.900 ms
EXECUTE takes 16.578 ms
- MEAN : 16.578 ms
- MAX : 28.282 ms
- MIN : 14.836 ms
- GEOMEAN : 16.434 ms
===================================
No CL_Tuner
Package Filename /root/mnasnet_1.0_224_1_default_1/
===================================
MODEL_LOAD takes 11.660 ms
PREPARE takes 3142.837 ms
EXECUTE takes 19.295 ms
- MEAN : 19.295 ms
- MAX : 25.660 ms
- MIN : 17.280 ms
- GEOMEAN : 19.198 ms
===================================
I applied the OpenCL Tuner to the acl_cl backend. The OpenCL Tuner searches for the local work size that gives the best GPU performance for each OpenCL kernel. See: https://developer.arm.com/solutions/machine-learning-on-arm/developer-material/how-to-guides/implement-a-neural-style-transfer-on-android-with-arm-nn-apis/tuning-performance-with-opencl-tuner
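To make the idea of local work size tuning concrete, here is a minimal sketch of a brute-force search. It is not the tuner's actual implementation: `tune_local_work_size`, `fake_time_kernel`, the candidate sizes, and the 256 work-item limit are all hypothetical names and values chosen for illustration. It also assumes each local size must evenly divide the global size.

```python
import itertools
import math


def tune_local_work_size(time_kernel, global_size, candidates=(1, 2, 4, 8, 16),
                         max_work_group_size=256):
    """Try every candidate local work size and return the fastest one.

    time_kernel is a hypothetical callback that runs the kernel once with
    the given local work size and returns the elapsed time in milliseconds.
    """
    best_lws, best_ms = None, math.inf
    for lws in itertools.product(candidates, repeat=len(global_size)):
        # Skip work groups larger than the (assumed) device limit.
        if math.prod(lws) > max_work_group_size:
            continue
        # Assumption: each local size must evenly divide the global size.
        if any(g % l != 0 for g, l in zip(global_size, lws)):
            continue
        elapsed_ms = time_kernel(lws)
        if elapsed_ms < best_ms:
            best_lws, best_ms = lws, elapsed_ms
    return best_lws, best_ms


def fake_time_kernel(lws):
    # Stand-in cost model, NOT a real GPU measurement: fastest at (8, 4).
    return 1.0 + abs(lws[0] - 8) + abs(lws[1] - 4)


best_lws, best_ms = tune_local_work_size(fake_time_kernel, global_size=(224, 224))
print(best_lws, best_ms)
```

On a real device the callback would enqueue the kernel and time it, and the search cost would show up as extra preparation or tuning time rather than in the steady-state execution time.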
I am going to introduce the OpenCL Tuner with a draft PR.
Refer to the test results for CL_Tuner and No CL_Tuner.
Model: mnasnet https://tfhub.dev/tensorflow/lite-model/mnasnet_1.0_224/1/default/1
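The two runs can be compared with a short script. This is only a sketch: it assumes the first result block in this post is the CL_Tuner run and the second is the No CL_Tuner run, and `parse_phases` is a hypothetical helper, not part of the benchmark tool. The numbers are copied from the logs.

```python
import re

# Phase timings copied from the two result blocks in this post.
CL_TUNER_LOG = """\
MODEL_LOAD takes 27.815 ms
PREPARE takes 3220.900 ms
EXECUTE takes 16.578 ms
"""

NO_CL_TUNER_LOG = """\
MODEL_LOAD takes 11.660 ms
PREPARE takes 3142.837 ms
EXECUTE takes 19.295 ms
"""


def parse_phases(log):
    """Return {phase_name: milliseconds} for lines like 'EXECUTE takes 16.578 ms'."""
    pattern = r"^(\w+) takes ([\d.]+) ms$"
    return {name: float(ms) for name, ms in re.findall(pattern, log, re.MULTILINE)}


tuned = parse_phases(CL_TUNER_LOG)
untuned = parse_phases(NO_CL_TUNER_LOG)

# Relative reduction in mean EXECUTE time with the tuner enabled.
reduction = (untuned["EXECUTE"] - tuned["EXECUTE"]) / untuned["EXECUTE"] * 100
print(f"EXECUTE: {untuned['EXECUTE']} ms -> {tuned['EXECUTE']} ms ({reduction:.1f}% lower)")
```

Under that assumption, the mean EXECUTE time drops from 19.295 ms to 16.578 ms, while PREPARE is slightly higher with the tuner (3220.900 ms vs 3142.837 ms). Each figure comes from a single benchmark run, so the difference should be treated as indicative only.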