How to deal with QuantizeLinear and DequantizeLinear node when I do qutization using openvino/tnn/mnn?

I train a x2 model, and after that I finetune it using QAT training using below command:

"python train.py --opt options/train/base7_qat.yaml --name base7_D4C28_bs16ps64_lr12-3_qat_x2 --scale 2 --bs 16 --ps 64 --lr 1e-3 --gpu_ids 1 --qat --qat_path experiment/ base7_D4C28_bs16ps64_lr12-3_x2/best_status".

and then I convert it to ONNX model using below cammand: "python -m tf2onnx.convert --saved-model

./experiment/base7_D4C28_bs16ps128_lr1e-3_x2_20210603/best_status --opset 13 --output ./ONNX/base7_D4C28_bs16ps128_lr1e-3_x2_20210603.onnx"

then I open this onnx model using netron:

无标题

I want to do qutization using openvino/tnn/mnn for this onnx model, my question is do I need to remove the QuantizeLinear and DequantizeLinear in red box first and then do qutization?

or should I just do qutization, and the openvino/tnn/mnn will remove it automatically?

and I also check the tflite model(using generate_tflite.py) --> onnx model, it seems the quantizated tflite model/onnx model contains node QuantizeLinear and DequantizeLinear, is it normal?

NJU-Jet / SR_Mobile_Quantization

How to deal with QuantizeLinear and DequantizeLinear node when I do qutization using openvino/tnn/mnn? #10