AII4-0 / TinyML-STM32

This repo contains firmware projects for the Nucleo-F411RE and the Nucleo-H7A3ZI-Q. It enables inference with the TFLite Micro engine on NASA-MSL data.

Quantization precision #1

Closed: robinfru closed this issue 11 months ago

robinfru commented 11 months ago

If the model is not quantized, the result is incorrect. Why? It may be a datatype problem.

If the model is quantized, precision is affected and the result is still not correct. Maybe the MSE of the dataset is too low. The data should be normalized, and currently it is not.
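
One way to rule out a datatype problem is to inspect the converted model's input and output tensors on the host before blaming the firmware. A minimal sketch, assuming the converted file is named gan_0_quant.tflite:

# Minimal sketch (model filename assumed): check the I/O datatypes of the
# converted model on the host.
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="gan_0_quant.tflite")
interpreter.allocate_tensors()

for detail in interpreter.get_input_details() + interpreter.get_output_details():
    # A fully quantized model reports int8 here; a dynamic-range or float
    # model keeps float32 inputs and outputs.
    print(detail["name"], detail["dtype"], detail["quantization"])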

robinfru commented 11 months ago

Changing the data representation has an effect on the output:

diff --git a/convert_to_tflite.py b/convert_to_tflite.py
index 7fb68a3..8d1f7a6 100644
--- a/convert_to_tflite.py
+++ b/convert_to_tflite.py
@@ -96,7 +96,7 @@ def main() -> None:
     print("Convert and save tf lite model - quantized")    
     converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
     converter.optimizations = [tf.lite.Optimize.DEFAULT]
-    converter.representative_dataset = representative_data_gen
+    # converter.representative_dataset = representative_data_gen
     # converter.target_spec.supported_types = [tf.float16]
     # converter.exclude_conversion_metadata = True
     # converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]

If representative_dataset is commented out, the result of gan_0_quant is:

[screenshot: gan_0_quant output with representative_dataset commented out]

And if representative_dataset is not commented out, the result of gan_0_quant is:

[screenshot: gan_0_quant output with representative_dataset enabled]

Hypothesis

With representative_dataset commented out, the result is correct, but precision is affected. With it enabled, there is a division by zero somewhere in the process and the output is NaN.
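
A quick way to test this hypothesis is to scan the calibration batches for NaNs or constant (zero-range) features, either of which could explain a division by zero during calibration. A sketch, reusing representative_data_gen from convert_to_tflite.py:

# Sketch: sanity-check the calibration data fed to the converter.
import numpy as np

for i, batch in enumerate(representative_data_gen()):
    data = batch[0]
    if np.isnan(data).any():
        print(f"batch {i}: contains NaN values")
    span = data.max(axis=0) - data.min(axis=0)
    if np.any(span == 0):
        print(f"batch {i}: constant feature(s) at indices", np.where(span == 0)[0])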

robinfru commented 11 months ago

Tips from Nuria

Quantization can be static or dynamic. The embedded target may not support dynamic quantization, and static quantization may not be performed correctly if the calibration data is not representative of the dataset.
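
In TFLite terms the two variants map to different converter settings; a sketch, assuming keras_model and representative_data_gen are the objects from convert_to_tflite.py:

import tensorflow as tf

# Dynamic-range quantization: only the weights are stored as int8;
# activations stay float32 and are quantized on the fly at runtime.
converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_dynamic = converter.convert()

# Static (full-integer) quantization: activation ranges are calibrated
# from the representative dataset, so that data has to cover the real
# input distribution.
converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
tflite_static = converter.convert()

With Optimize.DEFAULT alone the converter produces the dynamic-range variant, which is why commenting out representative_dataset in the diff above changed the behaviour.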

robinfru commented 11 months ago

I tried to extend the representative dataset. The evaluation result is good on the host, but not on the target. On the host the mean squared error is a little too high...

import numpy as np

test_dataloader = None  # set to the test DataLoader before conversion

def representative_data_gen():
    global test_dataloader
    iterator = iter(test_dataloader)
    length = len(test_dataloader)

    for i in range(length):
        # Model has only one input so each data point has one element.
        data = next(iterator)
        # Drop the last column of each batch and keep only the input features.
        data = data[0][:, :-1]
        data = data.numpy()

        yield [data.astype(np.float32)]
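
Since the host and target disagree, one useful host-side check is to compare the quantized model's outputs with the original Keras model over the same batches, which isolates the error added by quantization itself. A rough sketch, not the project's evaluation script; the filename gan_0_quant.tflite and float32 model I/O are assumptions, keras_model as in convert_to_tflite.py:

# Sketch: quantized vs float output error on the host.
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="gan_0_quant.tflite")
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

errors = []
for sample in representative_data_gen():
    x = sample[0]
    interpreter.resize_tensor_input(inp["index"], x.shape)
    interpreter.allocate_tensors()
    interpreter.set_tensor(inp["index"], x)
    interpreter.invoke()
    y_quant = interpreter.get_tensor(out["index"])
    y_float = keras_model(x).numpy()
    errors.append(np.mean((y_quant - y_float) ** 2))

print("mean squared error, quantized vs float:", np.mean(errors))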

On PC: [screenshot of evaluation results]

On Target: [screenshot of evaluation results]

We get the same result without the CMSIS-NN optimizations (except for the inference time).

robinfru commented 11 months ago

Tips from Nuria:

robinfru commented 11 months ago

Solved: the input tensor was not being filled correctly.
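
For reference, one common way an input tensor ends up filled incorrectly is copying raw float data into an int8 input without applying the tensor's scale and zero point; the exact firmware bug may have differed. The host-side version of a correct fill looks like this (a sketch, model filename assumed):

# Sketch: filling a fully quantized model's input tensor correctly on the host.
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="gan_0_quant.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

x = np.random.rand(*inp["shape"]).astype(np.float32)  # placeholder input data

if inp["dtype"] == np.int8:
    # Quantize the float input with the tensor's own scale and zero point
    # before writing it; writing raw floats into an int8 input leaves the
    # tensor filled incorrectly.
    scale, zero_point = inp["quantization"]
    x = np.round(x / scale + zero_point).astype(np.int8)

interpreter.set_tensor(inp["index"], x)
interpreter.invoke()

y = interpreter.get_tensor(out["index"])
if out["dtype"] == np.int8:
    # Dequantize the output back to float for evaluation.
    scale, zero_point = out["quantization"]
    y = (y.astype(np.float32) - zero_point) * scale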