Hi @hslee4716. I think the code here is wrong. Specifically, the order of calls seems to be incorrect:
to_int8_model(model)
load_checkpoint(model, "exp/yolo_nas_M_custom_qat_1212/RUN_20231211_121606_029174/ckpt_best.pth")
You cannot load regular torch weights into a quantized model; it messes up the model completely. There is no simple way to save the quantized model state as you normally would. After you quantize a model, there is only one way to store it: export it to an ONNX file.
Please check the example notebook where we show how to fine-tune and export a YoloNAS model end-to-end: https://github.com/Deci-AI/super-gradients/blob/master/notebooks/yolo_nas_custom_dataset_fine_tuning_with_qat.ipynb
Thanks for your reply @BloodAxe. But when I try to do QAT for the YoloNAS model according to the code below, only quantized weights are saved. Therefore, I can load the weights only after the model has been quantized.
from super_gradients.training.datasets.detection_datasets.coco_format_detection import COCOFormatDetectionDataset
from super_gradients.training.transforms.transforms import (
DetectionMosaic,
DetectionRandomAffine,
DetectionHSV,
DetectionHorizontalFlip,
DetectionPaddedRescale,
DetectionStandardize,
DetectionTargetsFormatTransform,
)
from super_gradients.training.datasets.datasets_utils import worker_init_reset_seed
from super_gradients.training import Trainer
from super_gradients.common.object_names import Models
from super_gradients.training import models
from torch.utils.data import DataLoader
from super_gradients.training.losses import PPYoloELoss
from super_gradients.training.metrics import DetectionMetrics_050
from super_gradients.training.models.detection_models.pp_yolo_e import PPYoloEPostPredictionCallback
from super_gradients.training.utils.detection_utils import DetectionCollateFN
from super_gradients.training.pre_launch_callbacks import modify_params_for_qat
import warnings
warnings.filterwarnings('ignore')
input_size = (1280,1280)
batch_size = 2
num_workers = 12
train_dataset_params = dict(
data_dir="/datasets/ver5",
images_dir="/datasets/ver5/images/train",
json_annotation_file="/datasets/ver5/annotations/train.json",
input_dim=input_size,
ignore_empty_annotations=False,
with_crowd=False,
all_classes_list=['person', 'car'],
transforms=[
DetectionMosaic(prob=1., input_dim=input_size),
DetectionRandomAffine(degrees=0.0, scales=(0.5, 1.5), shear=0.0, target_size=input_size, filter_box_candidates=False, border_value=128),
DetectionHSV(prob=1.0, hgain=5, vgain=30, sgain=30),
DetectionHorizontalFlip(prob=0.5),
DetectionPaddedRescale(input_dim=input_size),
DetectionStandardize(max_value=255),
DetectionTargetsFormatTransform(input_dim=input_size, output_format="LABEL_CXCYWH"),
],
)
val_dataset_params = dict(
data_dir="/datasets/ver5",
images_dir="/datasets/ver_5/images/val",
json_annotation_file="/datasets/ver5/annotations/val.json",
input_dim=input_size,
ignore_empty_annotations=False,
with_crowd=False,
all_classes_list=['person', 'car'],
transforms=[
DetectionPaddedRescale(input_dim=input_size, max_targets=300),
DetectionStandardize(max_value=255),
DetectionTargetsFormatTransform(input_dim=input_size, output_format="LABEL_CXCYWH"),
],
)
train_dataloader_params = {
"shuffle": True,
"batch_size": batch_size,
"drop_last": True,
"pin_memory": True,
"collate_fn": DetectionCollateFN(),
"worker_init_fn": worker_init_reset_seed,
"num_workers": num_workers,
"persistent_workers": True,
}
val_dataloader_params = {
"shuffle": False,
"batch_size": batch_size,
"drop_last": False,
"pin_memory": True,
"collate_fn": DetectionCollateFN(),
"worker_init_fn": worker_init_reset_seed,
"num_workers": num_workers,
"persistent_workers": True,
}
train_params = {
"warmup_initial_lr": 1e-6,
"initial_lr": 5e-4,
"lr_mode": "cosine",
"cosine_final_lr_ratio": 0.1,
"optimizer": "AdamW",
"zero_weight_decay_on_bias_and_bn": True,
"lr_warmup_epochs": 3,
"warmup_mode": "LinearEpochLRWarmup",
"optimizer_params": {"weight_decay": 0.0001},
"ema": True,
"ema_params": {"beta": 25, "decay_type": "exp"},
"max_epochs": 300,
"mixed_precision": True,
"loss": PPYoloELoss(use_static_assigner=False, num_classes=2, reg_max=16),
"valid_metrics_list": [
DetectionMetrics_050(
score_thres=0.1,
top_k_predictions=300,
num_cls=2,
normalize_targets=True,
include_classwise_ap=True,
class_names=['person', 'car'],
post_prediction_callback=PPYoloEPostPredictionCallback(score_threshold=0.01, nms_top_k=1000, max_predictions=300, nms_threshold=0.7),
)
],
"metric_to_watch": "mAP@0.50",
}
train_params, train_dataset_params, val_dataset_params, train_dataloader_params, val_dataloader_params = modify_params_for_qat(
train_params, train_dataset_params, val_dataset_params, train_dataloader_params, val_dataloader_params
)
trainset = COCOFormatDetectionDataset(**train_dataset_params)
valset = COCOFormatDetectionDataset(**val_dataset_params)
train_loader = DataLoader(trainset, **train_dataloader_params)
valid_loader = DataLoader(valset, **val_dataloader_params)
trainer = Trainer(experiment_name="yolo_nas_M_custom_qat", ckpt_root_dir="experiments")
model = models.get(Models.YOLO_NAS_M, num_classes=2, pretrained_weights="coco")
model.cuda()
trainer.qat(model=model, training_params=train_params,
train_loader=train_loader, valid_loader=valid_loader, calib_loader=train_loader)
When I try to load the model as in the reference code,
model = models.get(
Models.YOLO_NAS_M,
num_classes=2,
checkpoint_num_classes=2,
checkpoint_path="weights/best.pth"
)
the error below occurs:
/mnt/nas/super-gradients/src/super_gradients/training/utils/checkpoint_utils.py", line 212, in __call__
raise ValueError(f"ckpt layer {ckpt_key} with shape {ckpt_val.shape} does not match {model_key}" f" with shape {model_val.shape} in the model")
ValueError: ckpt layer backbone.stem.conv.post_bn.weight with shape torch.Size([48]) does not match backbone.stem.conv.branch_3x3.conv.weight with shape torch.Size([48, 3, 3, 3]) in the model
And the keys of the loaded weights are as follows.
backbone.stem.conv.post_bn.weight
backbone.stem.conv.post_bn.bias
backbone.stem.conv.post_bn.running_mean
backbone.stem.conv.post_bn.running_var
backbone.stem.conv.post_bn.num_batches_tracked
backbone.stem.conv.rbr_reparam.weight
backbone.stem.conv.rbr_reparam.bias
backbone.stem.conv.rbr_reparam._input_quantizer._amax
backbone.stem.conv.rbr_reparam._weight_quantizer._amax
backbone.stage1.downsample.post_bn.weight
backbone.stage1.downsample.post_bn.bias
backbone.stage1.downsample.post_bn.running_mean
backbone.stage1.downsample.post_bn.running_var
backbone.stage1.downsample.post_bn.num_batches_tracked
backbone.stage1.downsample.rbr_reparam.weight
backbone.stage1.downsample.rbr_reparam.bias
backbone.stage1.downsample.rbr_reparam._input_quantizer._amax
...
heads.head3.reg_convs.0.seq.conv._weight_quantizer._amax
heads.head3.reg_convs.0.seq.bn.weight
heads.head3.reg_convs.0.seq.bn.bias
heads.head3.reg_convs.0.seq.bn.running_mean
heads.head3.reg_convs.0.seq.bn.running_var
heads.head3.reg_convs.0.seq.bn.num_batches_tracked
heads.head3.cls_pred.weight
heads.head3.cls_pred.bias
heads.head3.cls_pred._input_quantizer._amax
heads.head3.cls_pred._weight_quantizer._amax
heads.head3.reg_pred.weight
heads.head3.reg_pred.bias
heads.head3.reg_pred._input_quantizer._amax
heads.head3.reg_pred._weight_quantizer._amax
Sorry for the confusion, there are too many methods and not all of them have been updated to the recent model.export API.
1) Let's start with model.export itself. This is the new and recommended way to export a model to ONNX, and it gives you the most control over how the model is exported. In fact, you can specify that you want to export an INT8-quantized model, so you don't have to do PTQ manually: a model.export(..., quantization_mode=ExportQuantizationMode.INT8) call is enough. You just pass the regular trained model with FP32 weights and it will be quantized internally. You're not getting QAT here, but you can still pass a dataloader for model calibration if you like.
So in short:
trainer.train(model=my_model, ...)
my_model.export("my_model.onnx", quantization_mode=ExportQuantizationMode.INT8, postprocessing=True)
Please note, my_model.export will NOT modify the my_model instance. The instance will remain intact, with all weights and model state kept as they were before the call. This is done on purpose to avoid unwanted side effects.
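For the model from the training script above, an end-to-end sketch of this path could look as follows. The calibration_loader keyword and the ExportQuantizationMode import path follow the current export API, but treat the exact argument names, the output filename, and the reuse of train_params for plain training as assumptions that may differ from your setup:
from super_gradients.conversion import ExportQuantizationMode

# Regular FP32 training with your usual (non-QAT) training params;
# model.export() will not modify this instance afterwards.
trainer.train(model=model, training_params=train_params,
              train_loader=train_loader, valid_loader=valid_loader)

# Export straight to INT8 ONNX. PTQ calibration happens inside export();
# passing a calibration dataloader is optional but recommended.
model.export(
    "yolo_nas_m_int8.onnx",                  # output path is a placeholder
    quantization_mode=ExportQuantizationMode.INT8,
    calibration_loader=train_loader,         # assumed keyword, used only for calibration statistics
    postprocessing=True,                     # bake decoding/NMS into the ONNX graph
)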
2) The second option is to use trainer.ptq. This method does a similar job: it quantizes and exports a model. If the model supports the new export API it will use it; otherwise it will export the model using the old export routines (relevant for classification & segmentation models). This method will change the model state. It will save the exported ONNX file to the experiment directory and compute metrics for the PTQ model on the validation dataloader. You don't get much control over the exported ONNX when using this method: you can't specify the export format (BATCH/FLAT) or tune the postprocessing.
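A rough sketch of this PTQ-only path, reusing the dataloaders and metrics from the script above. The keyword names (valid_loader, calib_loader, valid_metrics_list) and the experiment name are assumptions modeled on the trainer.qat call and may not match the exact trainer.ptq signature in your version:
# PTQ only: no training loop, just calibration + export + validation metrics.
trainer = Trainer(experiment_name="yolo_nas_M_custom_ptq", ckpt_root_dir="experiments")
model = models.get(Models.YOLO_NAS_M, num_classes=2, pretrained_weights="coco")

trainer.ptq(
    model=model,
    valid_loader=valid_loader,                              # used to compute PTQ metrics
    calib_loader=train_loader,                              # calibration data
    valid_metrics_list=train_params["valid_metrics_list"],  # assumed keyword for the metrics to report
)
# The ONNX file is written to the experiment directory; the model instance is modified in place.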
3) trainer.qat performs PTQ (post-training quantization) and QAT (quantization-aware training) AND model export to ONNX. This method currently doesn't support the use of model.export, and thus the exported model will not have a postprocessing step. It also changes the model state, but it allows you to 'tune' the model and slightly increase the accuracy of the exported model.
After calling trainer.ptq or trainer.qat you should be able to call my_model.export("my_model.onnx", quantization_mode=ExportQuantizationMode.INT8, postprocessing=True) and set whatever postprocessing you like. Just don't try to load any weights there; after QAT the model already has the right state for export.
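Putting that together with the QAT script above, a minimal sketch is to run trainer.qat and then export the same instance, without loading any checkpoint in between (the output filename is a placeholder):
from super_gradients.conversion import ExportQuantizationMode

# QAT: PTQ + quantization-aware fine-tuning; modifies `model` in place.
trainer.qat(model=model, training_params=train_params,
            train_loader=train_loader, valid_loader=valid_loader, calib_loader=train_loader)

# The model now carries the quantized state, so export it directly
# and attach whatever postprocessing you need.
model.export(
    "yolo_nas_m_qat.onnx",
    quantization_mode=ExportQuantizationMode.INT8,
    postprocessing=True,
)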
💡 Your Question
I successfully trained a yolo_nas_m model on a custom dataset with QAT and exported the model to INT8 ONNX. The ONNX inference results are normal, but when I export the ONNX to a TensorRT engine the results are abnormal (nothing is detected).
Here's my code, is there any problem?
Versions