MaverickPigoo opened this issue 1 month ago
Hi @MaverickPigoo, thanks for your interest! For your questions:
- Does the newly released 'TFLite Export with INT8 Quantization' only quantize the YOLOv8 backbone (i.e., the image encoder)? I note that you emphasize 'Please use Reparameterized YOLO-World for TFLite!!'; does that mean quantization of the text encoder is not supported?
The TFLite INT8 quantization only supports the Reparameterized YOLO-World, including the image backbone, neck, and detection head, but not the text encoder.
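For context, full-integer post-training quantization in TFLite requires a representative dataset to calibrate activation ranges. The sketch below uses a tiny stand-in Keras model (not the real YOLO-World graph) and random calibration data; paths, shapes, and the model itself are illustrative assumptions, not the repo's actual export script.

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in for the reparameterized image branch (backbone/neck/head);
# the real export would load the reparameterized YOLO-World model instead.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64, 64, 3)),
    tf.keras.layers.Conv2D(8, 3, padding="same", activation="relu"),
    tf.keras.layers.Conv2D(4, 1),
])

def representative_dataset():
    # Calibration samples let the converter estimate activation ranges;
    # in practice these would be real preprocessed images.
    for _ in range(16):
        yield [np.random.rand(1, 64, 64, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Force full-integer ops so the model runs end-to-end in INT8.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()
```

Because the text encoder is excluded, only the fixed image branch goes through this conversion; the class embeddings are baked into the graph beforehand.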
- How is the detection performance of the INT8-quantized model, and are there any corresponding metrics?
The INT8 quantization has a slight performance drop of about 1.0 AP on COCO; otherwise, the FP16 and INT8 results are consistent. BTW, quantization-aware training (QAT) will bridge the gap.
- Will you consider adding support for quantization with the PyTorch quantization API?
Sure, and we're working on it. Next, we plan to release TensorRT export with quantization, as well as quantization-aware training.
For the reparameterized module, if the text prompts change, the parameters of the 1×1 conv change as well. Does this part of the parameters need to be recalibrated? If recalibration is needed, is it only for the neck?
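To illustrate why the question arises: in the reparameterized setup, the offline text embeddings are folded into 1×1 conv weights, so new prompts produce new weights while the rest of the network is untouched. The numpy sketch below shows this folding under assumed shapes and names (a 1×1 conv mapping `embed_dim` image channels to per-class scores is just a matmul whose kernel is the embedding matrix); it is an illustration, not the repo's actual reparameterization code.

```python
import numpy as np

def reparam_1x1_conv(text_embeddings):
    # text_embeddings: (num_classes, embed_dim), precomputed offline
    # by the text encoder for the chosen prompts.
    # Returns a 1x1 conv kernel of shape (1, 1, embed_dim, num_classes):
    # applying it to image features computes per-class similarity scores.
    num_classes, embed_dim = text_embeddings.shape
    return text_embeddings.T.reshape(1, 1, embed_dim, num_classes)

# Changing prompts regenerates only these weights; any INT8 calibration
# ranges attached to the ops that consume them may therefore go stale,
# while the backbone weights and activations are unaffected.
old = reparam_1x1_conv(np.random.rand(3, 256))  # 3 classes
new = reparam_1x1_conv(np.random.rand(5, 256))  # 5 classes after a prompt change
```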