Closed: Rahn80643 closed this issue 2 years ago.
Just to check, do the results of all the models look correct when not using ArmNN e.g. when running them in the TensorFlow Lite interpreter?
For the int8 model are you correctly quantizing your input if you used integer only quantization?
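To make the question concrete: integer-only TFLite models expect the float input to be affine-quantized with the model's input scale and zero point before invoking the interpreter. A minimal sketch of that quantization (the scale and zero-point values below are invented for illustration; a real model stores its own in the input tensor's quantization parameters):

```python
def quantize_to_int8(x, scale, zero_point):
    """Affine-quantize a float value to int8: q = round(x / scale) + zero_point."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))  # clamp to the int8 range

# Hypothetical scale/zero_point, chosen only for this example.
scale, zero_point = 0.0078125, 0  # scale = 1/128
print(quantize_to_int8(0.5, scale, zero_point))   # 64
print(quantize_to_int8(-2.0, scale, zero_point))  # -128 (clamped)
```

The analogous dequantization, x = (q - zero_point) * scale, applies when interpreting the int8 output.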
Hi @Rahn80643
Could you please share the models with us?
Hi, @morgolock
The attachment is the converted int8 model. The input name is input_tensor and the output name is final_output. The input tensor size is 360 x 360 x 3 (W x H x C); the output tensor size is 180 x 180 (W x H).
Best Regards, Rahn
Hi @Rahn80643
Could you please share the original non converted model?
Hi, @morgolock
The attachment is the original pb model. The input name is input_tensor and the output name is final_output. The input tensor size is 360 x 360 x 3 (W x H x C); the output tensor size is 180 x 180 (W x H).
Best Regards, Rahn
Hi @Rahn80643
What images did you use for your representative_dataset when performing your post-training quantization? Do you have a link to the image set? How did you run the models through Arm NN?
Best regards, Mike
Hi, the link below points to the images I used for post-training quantization: google drive link. In the image preprocessing function, please note that the images should be resized to 360 x 360.
I modified the sample code in ML-examples/armnn-mobilenet-quant to execute my model; the modifications are as follows.
1. (mobilenetv1_quant_tflite.cpp) Add an extra container, TContainer_int8, for the int8 model.
2. (mobilenetv1_quant_tflite.cpp) Modify normParams.scale, normParams.mean, and normParams.stddev for the int8 model.
3. (mobilenetv1_quant_tflite.cpp) Preprocess images by calling PrepareImageTensor_int, which is modified in utils.cpp.
4. (utils.cpp) Cast the pixel values to uint8_t and return the preprocessed image.
5. (mobilenetv1_quant_tflite.cpp) Execute inference and get the output result.
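The normalize-then-quantize preprocessing described in these steps might be sketched in Python as follows (the function name mirrors PrepareImageTensor_int, but all parameter values here are placeholders, not the actual values from the modified utils.cpp):

```python
def prepare_image_tensor_int8(pixels, scale, mean, stddev, q_scale, q_zero_point):
    """Normalize pixel values with normParams-style parameters, then
    affine-quantize the result into the int8 range."""
    out = []
    for p in pixels:
        normalized = (p * scale - mean) / stddev        # apply scale/mean/stddev
        q = round(normalized / q_scale) + q_zero_point  # quantize to the model's input params
        out.append(max(-128, min(127, q)))              # clamp to int8
    return out

# Placeholder parameters, for illustration only.
print(prepare_image_tensor_int8([0, 128, 255], 1 / 255.0, 0.5, 0.5, 1 / 127.0, 0))
# -> [-127, 0, 127]
```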
Best Regards, Rahn
Hi @Rahn80643
Could you please also share the uint8 model with us?
Hi, The following attachment is the converted tflite uint8 model. model_uint8.lite.tar.gz
Best Regards, Rahn
Hi @Rahn80643
there were some int8 fixes in master recently, so I ran one of your images through "model_int8.lite" using the versions of ArmNN and ComputeLibrary on master: I got good results on CpuRef but very poor results on both CpuAcc and GpuAcc.
When we looked into the model a bit more, we saw that "model_int8.lite" uses per-axis quantization for the weights in its convolution and depthwise convolution layers, but ComputeLibrary doesn't have full per-axis support: it only applies quantization along the channel dimension.
I'll add a test to the CL and Neon backends to check for situations where the quantization axis is not the channel dimension; those cases cannot be supported on CpuAcc or GpuAcc right now.
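To illustrate the distinction: with per-axis (per-channel) quantization, each output channel of a convolution carries its own scale and zero point, so dequantization must index them per channel rather than use a single per-tensor pair. A small sketch with invented values:

```python
def dequantize_per_channel(weights, scales, zero_points):
    """Dequantize int8 convolution weights where axis 0 is the output-channel
    dimension and every channel has its own scale/zero_point (per-axis scheme)."""
    return [
        [(w - zp) * s for w in channel]
        for channel, s, zp in zip(weights, scales, zero_points)
    ]

# Two output channels with three weights each (made-up numbers).
weights = [[10, -20, 30], [4, 8, -8]]
scales = [0.5, 0.25]  # one scale per output channel, not one per tensor
zero_points = [0, 0]
print(dequantize_per_channel(weights, scales, zero_points))
# -> [[5.0, -10.0, 15.0], [1.0, 2.0, -2.0]]
```

A backend that assumed a single per-tensor scale here would dequantize the second channel with the wrong scale, which is consistent with the poor results described above.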
Best regards, Mike
Hi, I was informed to use ArmNN 21.05 with this patch to test the issue of wrong results in qasymms8 (int8) models. However, I got errors when compiling this patch against Compute Library 21.05: armnn/src/backends/cl/workloads/ClAbsWorkload.hpp:11:10: fatal error: arm_compute/runtime/CL/functions/CLElementwiseUnaryLayer.h: No such file or directory #include <arm_compute/runtime/CL/functions/CLElementwiseUnaryLayer.h>
I also tried to compile this Arm NN patch with the Compute Library 5907 patch, but the error message above still appears. Could this issue be caused by the settings in the makefiles?
Best regards, Rahn
Hi, I've updated Arm NN to 21.08, but the inferred results from the int8 model are still wrong. Could the issue be caused by biases with different quantization scales in int8 tflite models?
Hi @Rahn80643
This patch in ACL fixes the problem in the CPU backend
https://review.mlplatform.org/c/ml/ComputeLibrary/+/6378
Hope this helps.
Hi @morgolock,
Thank you for pointing out the ACL patch; the inference results of the int8 model in CpuAcc mode are now identical to those from the TFLite API.
We would like to ask another question: if we run the int8 model on the Ethos NPU, will the inferred results contain noise, or will they be identical to the TFLite API results?
Best regards, Rahn
Hi @Rahn80643
I'd advise directing any queries about the Ethos NPU driver to https://github.com/ARM-software/ethos-n-driver-stack/issues
Hope this helps.
Hi, I have asked the related question on https://github.com/ARM-software/ethos-n-driver-stack/issues. Thank you so much for providing the information above.
Rahn
Noise appears in the uint8 model and wrong results in the int8 model
Hi, I've been testing our segmentation models, converted to uint8 and int8 precision, on different computation units on an Odroid board. The detected results of the float model are the same on all computation units; the results of the uint8 model contain some noise when executing in GpuAcc mode; and the results of the int8 model are completely wrong on all computation units. More detailed information about the models is as follows:
1. The version of ArmNN is 21.05, in order to execute int8 models.
2. The uint8 model is converted based on the steps in Quantification (pb->tflite), Method 1.
3. The int8 model is converted based on the steps in post_training_quantization, int only.
Comparisons of different models on different computation units:
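For reference, the int-only post-training quantization mentioned in the conversion step above is typically configured as follows. This is a converter-configuration sketch assuming TensorFlow 2.x; "model_dir" and representative_images are placeholders, not the actual model path or dataset:

```python
import tensorflow as tf  # assumes TensorFlow 2.x is installed

def representative_dataset():
    # representative_images is a placeholder for the 360x360 calibration images
    for image in representative_images:
        yield [image[None, ...].astype("float32")]

# "model_dir" is a placeholder path to the model being converted.
converter = tf.lite.TFLiteConverter.from_saved_model("model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Force integer-only kernels and int8 input/output tensors:
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()
```

With inference_input_type set to tf.int8, the caller is responsible for quantizing inputs with the model's input scale and zero point, as discussed earlier in the thread.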
Best regards, Rahn