I am using the TFLite benchmark executable (found in this repo under /tensorflow/lite/tools/benchmark/) on my NXP IMX8 board to profile my quantized model. When running on the CPU, I get runtime details for every operation, but none are displayed when the parameter --external_delegate_path=/usr/lib/libvx_delegate.so is given; I expected one Vx Delegate node for the whole model. I would like to see the individual operator details, e.g. the runtime of each Conv. I have seen this topic for the GPU; is it the same issue for the vx_delegate?
Following the documentation from NXP (page 14), I run the benchmark tool with the following arguments:
./benchmark_model --graph=my_model.quant.tflite --enable_op_profiling=true --max_delegated_partitions=1000 --external_delegate_path=/usr/lib/libvx_delegate.so