ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Yolov5 Int8 export in PyTorch #13267

Open Praveen-mvp opened 2 months ago

Praveen-mvp commented 2 months ago


Description

Exporting my model to INT8 within PyTorch itself

Use case

Hi Team, I would like to have code or a command for exporting my model to INT8, but within PyTorch itself. Is there any way or code for doing this? Because I can save my model in PyTorch format on YOLOv8 with:

torch.save(model.export(format="onnx", int8=True), 'yolo-quant.pt')

But when I tried the same step on YOLOv5, I couldn't convert it. So please share your knowledge regarding this.

Additional

No response


glenn-jocher commented 2 months ago

@Praveen-mvp thank you for your interest. Currently, YOLOv5 does not support direct INT8 export in PyTorch. You might consider exporting to ONNX first and then using tools like TensorRT for INT8 quantization. For more details, please refer to our model export tutorial.

Praveen-mvp commented 2 months ago

@glenn-jocher, got it. Thanks for the support. I also have a question regarding an issue. I tried static quantization on my custom-trained model, but I'm encountering an error during inference:

AttributeError: 'Conv2d' object has no attribute '_modules'

Please share your thoughts on this.

glenn-jocher commented 2 months ago

@Praveen-mvp it looks like the error might be related to the model's layers not being compatible with static quantization. Ensure you are using the latest version of YOLOv5 and PyTorch. If the issue persists, please provide more details about your quantization process and the specific code you are using. This will help us better understand and address the problem.

Praveen-mvp commented 1 month ago

Hi @glenn-jocher,

Thank you for your help and support. I was able to resolve the issue by loading the model directly instead of using the built-in class (DetectMultiBackend).

glenn-jocher commented 1 month ago

@Praveen-mvp great to hear that you resolved the issue by loading the model directly. If you have any further questions or need additional assistance, feel free to ask.

Praveen-mvp commented 1 month ago

Hi @glenn-jocher, is there any example code for doing static quantization on an ultralytics/yolov5 model? I'm not sure whether the way I quantized my model is correct. If there is, please share it; it would be helpful for us.

And to clarify: we can't use dynamic quantization, right? Because dynamic quantization doesn't support conv layers?

glenn-jocher commented 1 month ago

Hi @Praveen-mvp, for static quantization, you can follow PyTorch's official static quantization tutorial. Dynamic quantization is indeed not suitable for convolutional layers. If you encounter any issues, please ensure you are using the latest version of YOLOv5 and PyTorch.
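[Editor's note: the conv-layer point can be checked directly. `torch.ao.quantization.quantize_dynamic` only has dynamic kernels for modules like `Linear` and LSTM, so a `Conv2d` listed in the target set is silently left in float. A small sketch:]

```python
import torch
from torch import nn
from torch.ao import quantization

net = nn.Sequential(
    nn.Conv2d(3, 8, 3),
    nn.Flatten(),
    nn.Linear(8 * 30 * 30, 10),
).eval()

# Ask for both Conv2d and Linear; only Linear has a dynamic quantized counterpart
dq = quantization.quantize_dynamic(net, {nn.Conv2d, nn.Linear}, dtype=torch.qint8)

print(type(dq[0]))  # Conv2d is untouched (still the float module)
print(type(dq[2]))  # Linear is swapped for its dynamic quantized counterpart
```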

Praveen-mvp commented 1 month ago

Hi @glenn-jocher, this is how I am doing my quantization:

# Imports (torch.ao.quantization is the current home; torch.quantization is a deprecated alias)
import torch
from torch.ao import quantization

# Load the checkpoint; YOLOv5 .pt files store the full model under the 'model' key
checkpoint = torch.load('yolov5n.pt')
model = checkpoint['model'].float()  # ensure model is in float32 before quantization

# Static quantization calibrates in eval mode, so switch before preparing
model.eval()

# Set the quantization configuration ('fbgemm' targets x86 CPUs)
model.qconfig = quantization.get_default_qconfig('fbgemm')

# Insert observers for calibration
model_prepared = quantization.prepare(model)

# Perform calibration with dummy data
with torch.no_grad():
    for _ in range(10):  # replace with actual calibration data
        input_batch = torch.randn(1, 3, 640, 640)  # adjust to your input size
        model_prepared(input_batch)

# Convert the observed model to a quantized version
model_quant = quantization.convert(model_prepared)

# Save the quantized model
torch.save({'model': model_quant.state_dict()}, 'yolov5_quantized.pt')

This is how I am quantizing and saving my model; please verify the code. For loading, we need to set up the model architecture the same way before restoring the state_dict. However, when using the quantized model with detect.py, I am encountering layer compatibility issues. I tried loading the model directly, but no detection results are produced. [Note: this is the pre-trained model from Ultralytics.]

Could you please help me identify what might be going wrong?
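[Editor's note: one likely culprit, assuming eager-mode (not FX graph mode) quantization is in use: the script above never wraps the network in `QuantStub`/`DeQuantStub` and never fuses Conv+BN(+activation) modules, so after `convert` the inputs are never actually quantized. The usual eager-mode recipe on a toy model looks like this; it is not the YOLOv5 architecture, whose SiLU activations are not handled by `fuse_modules` and need extra work.]

```python
import torch
from torch import nn
from torch.ao import quantization

class TinyNet(nn.Module):
    # Toy stand-in; YOLOv5's real modules (SiLU activations, Detect head) need extra handling
    def __init__(self):
        super().__init__()
        self.quant = quantization.QuantStub()      # quantizes the float input
        self.conv = nn.Conv2d(3, 8, 3)
        self.bn = nn.BatchNorm2d(8)
        self.relu = nn.ReLU()
        self.dequant = quantization.DeQuantStub()  # returns a float output
    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.bn(self.conv(x)))
        return self.dequant(x)

model = TinyNet().eval()
model.qconfig = quantization.get_default_qconfig("fbgemm")

# Fuse Conv+BN+ReLU so they quantize as one unit
quantization.fuse_modules(model, [["conv", "bn", "relu"]], inplace=True)

prepared = quantization.prepare(model)
with torch.no_grad():
    for _ in range(10):  # stand-in calibration data
        prepared(torch.randn(1, 3, 32, 32))

quantized = quantization.convert(prepared)
out = quantized(torch.randn(1, 3, 32, 32))
```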

Praveen-mvp commented 1 month ago

Here are the versions I'm using:

Python - 3.8
Ultralytics - 8.2.45
PyTorch - 2.3.1+cu121

glenn-jocher commented 1 month ago

@Praveen-mvp thank you for sharing the versions. Your quantization approach looks mostly correct, but YOLOv5's custom layers might need specific handling. Ensure you are using the latest YOLOv5 version and refer to PyTorch's static quantization tutorial for detailed guidance. If the issue persists, please provide more details about the layer compatibility issues you're encountering.
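[Editor's note on the save/load side, an assumption since the loading code isn't shown in the thread: `load_state_dict` on a quantized checkpoint only works if the receiving model has already gone through the same prepare/convert steps. Tracing to TorchScript sidesteps this by bundling graph and weights together; a sketch on a toy module:]

```python
import torch
from torch import nn
from torch.ao import quantization

class Tiny(nn.Module):
    # Hypothetical toy model standing in for the quantized network
    def __init__(self):
        super().__init__()
        self.quant = quantization.QuantStub()
        self.conv = nn.Conv2d(3, 4, 3)
        self.dequant = quantization.DeQuantStub()
    def forward(self, x):
        return self.dequant(self.conv(self.quant(x)))

m = Tiny().eval()
m.qconfig = quantization.get_default_qconfig("fbgemm")
prepared = quantization.prepare(m)
with torch.no_grad():
    prepared(torch.randn(1, 3, 8, 8))  # one-shot calibration for the sketch
q = quantization.convert(prepared)

# Trace and save: the TorchScript file carries the quantized graph, not just weights
ts = torch.jit.trace(q, torch.randn(1, 3, 8, 8))
torch.jit.save(ts, "tiny_quantized_ts.pt")

# Loading needs no model class and no prepare/convert boilerplate
loaded = torch.jit.load("tiny_quantized_ts.pt")
out = loaded(torch.randn(1, 3, 8, 8))
```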