Xilinx / Vitis-AI

Vitis AI is Xilinx’s development stack for AI inference on Xilinx hardware platforms, including both edge devices and Alveo cards.
https://www.xilinx.com/ai
Apache License 2.0

Can't set quantization strategy either with `quantize_model()` or `set_quantize_strategy()` functions in TF2 #1427


jonGuti13 commented 5 months ago


Description:

When I attempt to quantize my floating-point trained model with a simple quantization strategy, specified either via the **kwargs of `quantize_model()` or via the JSON file passed to `set_quantize_strategy()`, I end up with a model that uses the default quantization strategy.

This is what the Netron app shows for an activation (method: 1): [screenshot: activationProperties]

and for a Conv2D layer (also method: 1): [screenshot: conv2Dproperties]

even though I specify that I want the Vitis AI Quantizer to use method 0 for those operators. The Python code I used to perform the quantization is as follows:

```python
from tensorflow_model_optimization.quantization.keras import vitis_quantize

# Approach 1: pass the overrides directly as **kwargs of quantize_model()
quantized_model = quantizer.quantize_model(
    calib_dataset=calib_generator, calib_steps=calib_steps,
    include_fast_ft=calib_fast_finetuning, fast_ft_epochs=calib_fast_ft_epochs,
    include_cle=True, fold_conv_bn=True, activation_symmetry=False,
    activation_method=0, weight_method=0)

# Approach 2: load the overrides from a JSON file via set_quantize_strategy()
quantizer = vitis_quantize.VitisQuantizer(
    trained_model, quantize_strategy=quantization_strategy,
    target='DPUCZDX8G_ISA1_B4096', target_type='name')
quantizer.set_quantize_strategy(new_quantize_strategy="./quant_strategy.json")
quantized_model = quantizer.quantize_model(calib_dataset=calib_generator)
```

What puzzles me is that when I dump the quantization strategy file after the quantization process finishes, the new JSON contains exactly the changes I want (attached .json).

```python
quantizer.dump_quantize_strategy(dump_file=dump_quantization_file, verbose=2)
```
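
As a cross-check, one can parse the dumped JSON and list every method entry the quantizer believes it is applying. This is only a sketch: it reuses `quantizer` from the snippets above, and `dumped_strategy.json` is a hypothetical path:

```python
import json

quantizer.dump_quantize_strategy(dump_file='dumped_strategy.json', verbose=2)

with open('dumped_strategy.json') as f:
    dumped = json.load(f)

# Recursively walk the dumped strategy and report every "method" key.
def find_methods(node, path=''):
    if isinstance(node, dict):
        for key, value in node.items():
            if key == 'method':
                print(f'{path}/{key} = {value}')
            find_methods(value, f'{path}/{key}')
    elif isinstance(node, list):
        for i, item in enumerate(node):
            find_methods(item, f'{path}[{i}]')

find_methods(dumped)
```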

It is also important to note that I am able to override some default configurations, such as `include_cle` (setting it to False works) and `fold_conv_bn` (likewise), but not the ones related to, for example, the quantization method (0, 1, 2 or 3).

The obtained results suggest that Netron is correctly describing the quantization strategy that was actually applied, since the results do not change between method 0 and method 1. That said, it is possible that for a particular model both methods produce the same output.
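
One way to probe this directly is to quantize the same float model twice, changing only the method kwargs, and compare the outputs. A sketch, reusing `trained_model` and `calib_generator` from the snippets above; the probe input shape is hypothetical:

```python
import numpy as np
from tensorflow_model_optimization.quantization.keras import vitis_quantize

# Quantize twice, differing only in the method kwargs.
qm0 = vitis_quantize.VitisQuantizer(trained_model).quantize_model(
    calib_dataset=calib_generator, activation_method=0, weight_method=0)
qm1 = vitis_quantize.VitisQuantizer(trained_model).quantize_model(
    calib_dataset=calib_generator, activation_method=1, weight_method=1)

# Hypothetical probe input; match your model's input shape.
probe = np.random.rand(8, 224, 224, 3).astype('float32')
diff = np.abs(qm0.predict(probe) - qm1.predict(probe)).max()
print('max abs output difference (method 0 vs 1):', diff)
# A difference of exactly 0 is consistent with the kwargs being ignored,
# although identical outputs for a particular model cannot be ruled out.
```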

Steps to Reproduce:

1. Define a basic DNN model; it does not need to be trained (see the sketch after this list).
2. Perform a simple Post Training Quantization (also known as Quantize Calibration) as shown above.
3. Visualize the model in Netron and observe that, for example, the method property of the Conv2D layer or the activation layer has not changed.
4. Dump the quantization strategy and observe that the JSON correctly contains the changes that cannot be seen in Netron.
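
For convenience, here is a minimal, self-contained sketch of these steps. The toy model, input shapes, and file names are hypothetical; the quantizer calls follow the API used above and assume the Vitis AI TF2 docker environment:

```python
import numpy as np
import tensorflow as tf
from tensorflow_model_optimization.quantization.keras import vitis_quantize

# Step 1: a basic, untrained model is enough.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(8, 3, activation='relu'),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10),
])

# Step 2: simple Post Training Quantization with method overrides.
calib_data = np.random.rand(16, 32, 32, 3).astype('float32')
quantizer = vitis_quantize.VitisQuantizer(model)
quantized_model = quantizer.quantize_model(
    calib_dataset=calib_data, activation_method=0, weight_method=0)

# Step 3: save and inspect in Netron; the method property still shows the default.
quantized_model.save('quantized_model.h5')

# Step 4: dump the strategy; the JSON does contain the requested overrides.
quantizer.dump_quantize_strategy(dump_file='dumped_strategy.json', verbose=2)
```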

Expected Behaviour

I expected to observe in Netron that the quantization method had changed.

Actual Behaviour

I observed that the quantization method is still the same.

Docker Image

GPU image for TensorFlow2, built locally from the repository at this commit.

Has anybody faced a similar issue before? Any help would be highly appreciated.