ARM-software / ComputeLibrary

The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologies.
MIT License

Benchmark quantized model test #1077

Closed wenhyan closed 7 months ago

wenhyan commented 8 months ago

Hello. Problem description: I want to benchmark mobilenet_v2 with the example "graph_mobilenet_v2.cpp". The example expects the model files at const std::string model_path = "/cnn_data/mobilenet_v2_1.0_224_model/"; and I found the archive mobilenet_v2_1.0_224.tgz. Do you have a script to export the model parameters and quantize them? Or do I need to quantize the parameters during inference?

morgolock commented 8 months ago

Hi @wenhyan

The best way to assess a model's performance is to use ArmNN's ExecuteNetwork tool with the option -e 1.

You can download the ArmNN prebuilt binaries, which include ExecuteNetwork, from https://github.com/ARM-software/armnn/releases/tag/v23.08

You will need to get the tflite file for mobilenet_v2, or for any other model you would like to test.

Then you can easily run the model as shown below:

/armnn/main# LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH ./ExecuteNetwork -m ../../tflite_models/mobilenet_v2_1.0_224_quant.tflite -c CpuAcc --iterations=5    -N
Warning: No input files provided, input tensors will be filled with 0s.
Info: ArmNN v33.0.0
Info: Initialization time: 15.52 ms.
Info: Optimization time: 10.20 ms
Info: Network will be executed 5 times successively. The input-tensor-data files will be reused recursively if the user didn't provide enough to cover each execution.
Warning: The input data was generated, note that the output will not be useful
Info: Printing outputs to console is disabled.
===== Network Info =====
Inputs in order:
input, [1,224,224,3], QAsymmU8 Quantization Offset: 128 Quantization scale: 0.0078125
Outputs in order:
output, [1,1001], QAsymmU8 Quantization Offset: 58 Quantization scale: 0.0988925

Info: Execution time: 72.84 ms.
Info: Inference time: 72.96 ms

Info: Execution time: 26.93 ms.
Info: Inference time: 27.03 ms

Info: Execution time: 23.66 ms.
Info: Inference time: 23.76 ms

Info: Execution time: 23.56 ms.
Info: Inference time: 23.65 ms

Info: Execution time: 23.69 ms.
Info: Inference time: 23.79 ms

Info: Shutdown time: 3.65 ms.
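The "Quantization Offset" and "Quantization scale" lines in the network info describe the affine 8-bit scheme (QAsymmU8), where each real value is stored as scale * (quantized - offset). A minimal sketch of that mapping, using the input tensor's parameters from the log above (the helper names are our own, not part of ArmNN's API):

```python
# Sketch of the asymmetric 8-bit quantization scheme (QAsymmU8) reported by
# ExecuteNetwork: real = scale * (quantized - offset).

def dequantize(q, scale, offset):
    """Map an unsigned 8-bit value back to a real number."""
    return scale * (q - offset)

def quantize(real, scale, offset):
    """Map a real number to the nearest representable uint8 value."""
    q = round(real / scale) + offset
    return max(0, min(255, q))  # clamp to the uint8 range

# Input tensor parameters from the log: scale 0.0078125, offset 128,
# so the uint8 range [0, 255] covers reals in [-1.0, 0.9921875].
scale, offset = 0.0078125, 128
print(dequantize(0, scale, offset))    # -1.0
print(dequantize(255, scale, offset))  # 0.9921875
print(quantize(0.5, scale, offset))    # 192
```

The same arithmetic applies to the output tensor with its own scale (0.0988925) and offset (58).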

The -e 1 option will output detailed profiling information which you can then analyze using scripts. Please see also: https://github.com/ARM-software/ComputeLibrary/issues/950
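As a starting point for such scripts, a small sketch that summarises the console output above: it extracts the "Inference time" lines and averages them, discarding the first iteration since the log shows it is much slower than the rest (warm-up). The log text is inlined here for illustration; in practice you would read ExecuteNetwork's captured output from a file.

```python
# Summarise ExecuteNetwork console output: extract per-run inference times
# and report the average, excluding the first (warm-up) iteration.
import re

log = """
Info: Inference time: 72.96 ms
Info: Inference time: 27.03 ms
Info: Inference time: 23.76 ms
Info: Inference time: 23.65 ms
Info: Inference time: 23.79 ms
"""

times = [float(t) for t in re.findall(r"Inference time: ([0-9.]+) ms", log)]
steady = times[1:]  # drop the warm-up run
print(f"runs: {len(times)}, avg after warm-up: {sum(steady) / len(steady):.2f} ms")
```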

Hope this helps,