Thank you for the good example of how to deploy TensorFlow models to Cortex-M microcontrollers.
I am using the Hello Edge work as the basis of my thesis, in which I classify bird sounds with DS-CNNs. During my analysis I discovered that the quantized TensorFlow model and the one deployed to the Cortex-M microcontroller do not always produce the same classifications. I ran both on 100 different samples and found that in 8% of the cases the classifications from TensorFlow and CMSIS-NN did not match. Comparing the outputs of each layer of the TensorFlow and CMSIS-NN models showed that the difference came from the average pooling layer.
In arm_avepool_q7_HWC_nonsquare.c the magnitude of the sum values is never checked, so a value larger than 127 or smaller than -128 can be written into the output array. Since the output array is q7_t, this produces wrong results. In addition, it is advisable to use a float for the sum variable and apply round() to it, as is done in mfcc.cpp.
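For illustration, here is a minimal sketch of the kind of guard I mean (the function and variable names are placeholders, not the actual ones in arm_avepool_q7_HWC_nonsquare.c; in the repo q7_t comes from arm_math.h, the typedef is only to keep the sketch self-contained):

```c
#include <math.h>
#include <stdint.h>

typedef int8_t q7_t;   /* placeholder; provided by arm_math.h in the repo */

/* Accumulate in float, round as in mfcc.cpp, then saturate to the
 * q7_t range [-128, 127] before storing the output element. */
static q7_t average_round_saturate(float pool_sum, int kernel_size)
{
    float avg = roundf(pool_sum / (float)kernel_size);
    if (avg > 127.0f)  avg = 127.0f;
    if (avg < -128.0f) avg = -128.0f;
    return (q7_t)avg;
}
```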
A second problem related to average pooling is in ds_cnn.cpp, where the output left shift is hardcoded to 2. Instead it should be defined in ds_cnn.h, like the other left and right shift parameters. The output left shift for the average pooling layer can be calculated from the Q-format of the last DS layer output and the Q-format of the pooling layer output, both found by testing activation ranges: the shift is the difference in fractional bits. For example, if the best range for the last DS layer output is [-32, 32) (Q5.2, 2 fractional bits) and the best range for the pooling layer is [-1, 1) (Q0.7, 7 fractional bits), the output left shift for the average pooling layer should be 7 - 2 = 5.
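A possible way to express this in ds_cnn.h (the macro name is illustrative, not one that exists in the repo; the fractional-bit counts are the ones from the example above):

```c
/* Output left shift for the average pooling layer:
 * (fractional bits of pooling output, Q0.7) -
 * (fractional bits of last DS layer output, Q5.2) = 7 - 2 = 5.
 * The hardcoded value 2 in ds_cnn.cpp would be replaced by this macro. */
#define AVG_POOL_OUT_LSHIFT (7 - 2)
```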
After fixing these issues, all classification results (100 samples) matched between the quantized TensorFlow and CMSIS-NN models.