MaverickPigoo opened this issue 1 month ago
Hi @MaverickPigoo, thanks for your interest! For your questions:
- Does the newly released 'TFLite Export with INT8 Quantization' only quantize the YOLOv8 backbone (i.e., the image encoder)? I note that you emphasize 'Please use Reparameterized YOLO-World for TFLite!!'; does that mean quantization of the text encoder is not supported?
The TFLite INT8 quantization only supports the Reparameterized YOLO-World, including the image backbone, neck, and detection head, but not the text encoder.
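For context, full-integer post-training quantization in TFLite requires a representative dataset to calibrate activation ranges. The sketch below uses a tiny stand-in Keras model (not the real YOLO-World graph) and random calibration data; paths, shapes, and the model itself are illustrative assumptions, not the repo's actual export script.

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in for the reparameterized image branch (backbone/neck/head);
# the real export would load the reparameterized YOLO-World model instead.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64, 64, 3)),
    tf.keras.layers.Conv2D(8, 3, padding="same", activation="relu"),
    tf.keras.layers.Conv2D(4, 1),
])

def representative_dataset():
    # Calibration samples let the converter estimate activation ranges;
    # in practice these would be real preprocessed images.
    for _ in range(16):
        yield [np.random.rand(1, 64, 64, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Force full-integer ops so the model runs end-to-end in INT8.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()
```

Because the text encoder is excluded, only the fixed image branch goes through this conversion; the class embeddings are baked into the graph beforehand.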
- How is the detection performance of the INT8-quantized model, and are there any corresponding metrics?
The INT8 quantization has a slight performance drop of about 1.0 AP on COCO; otherwise, the FP16 and INT8 results are consistent. BTW, quantization-aware training (QAT) will bridge the gap.
- Will you consider adding support for quantization with the PyTorch quantization API?
Sure, and we're working on it. Next, we plan to release TensorRT export with quantization, as well as quantization-aware training.
For the reparameterized module, if the text prompts change, the parameters of the 1×1 conv change as well. Does this part of the parameters need to be recalibrated? If recalibration is needed, is it only for the neck?
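To illustrate why the question arises: in the reparameterized setup, the offline text embeddings are folded into 1×1 conv weights, so new prompts produce new weights while the rest of the network is untouched. The numpy sketch below shows this folding under assumed shapes and names (a 1×1 conv mapping `embed_dim` image channels to per-class scores is just a matmul whose kernel is the embedding matrix); it is an illustration, not the repo's actual reparameterization code.

```python
import numpy as np

def reparam_1x1_conv(text_embeddings):
    # text_embeddings: (num_classes, embed_dim), precomputed offline
    # by the text encoder for the chosen prompts.
    # Returns a 1x1 conv kernel of shape (1, 1, embed_dim, num_classes):
    # applying it to image features computes per-class similarity scores.
    num_classes, embed_dim = text_embeddings.shape
    return text_embeddings.T.reshape(1, 1, embed_dim, num_classes)

# Changing prompts regenerates only these weights; any INT8 calibration
# ranges attached to the ops that consume them may therefore go stale,
# while the backbone weights and activations are unaffected.
old = reparam_1x1_conv(np.random.rand(3, 256))  # 3 classes
new = reparam_1x1_conv(np.random.rand(5, 256))  # 5 classes after a prompt change
```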