meituan / YOLOv6

YOLOv6: a single-stage object detection framework dedicated to industrial applications.
GNU General Public License v3.0

How to Use QAT for Segmentation with YOLOv6? #1055

Open hamedgorji opened 5 months ago

hamedgorji commented 5 months ago


Description

Hi YOLOv6 Team,

I am currently working on a project that requires Quantization-Aware Training (QAT) for segmentation tasks using YOLOv6. I noticed that configurations like yolov6n_hs, yolov6n_opt, and yolov6n_opt_qat are available for detection but not for segmentation.

To enable QAT for segmentation, should I add the following blocks at the end of my config file:

ptq = dict(
    num_bits = 8,
    calib_batches = 4,
    # 'max', 'histogram'
    calib_method = 'max',
    # 'entropy', 'percentile', 'mse'
    histogram_amax_method='entropy',
    histogram_amax_percentile=99.99,
    calib_output_path='./',
    sensitive_layers_skip=False,
    sensitive_layers_list=[],
)

qat = dict(
    calib_pt = './assets/v6s_n_calib_max.pt',
    sensitive_layers_skip = False,
    sensitive_layers_list=[],
)

# Choose Rep-block by the training mode, choices=["repvgg", "hyper-search", "repopt"]
training_mode='repopt'

Could you please guide me on the correct approach to enable QAT for segmentation tasks? Are there any example configurations or guidelines available for integrating QAT with segmentation in YOLOv6?

Thank you.


hamedgorji commented 5 months ago

Update: I changed my config to:

# YOLOv6n-seg model
model = dict(
    type='YOLOv6n',
    pretrained='D:/YOLOv6-seg/assets/pretrained_opt.pt',
    scales='D:/YOLOv6-seg/assets/scale.pt',
    depth_multiple=0.33,
    width_multiple=0.25,
    backbone=dict(
        type='EfficientRep',
        num_repeats=[1, 6, 12, 18, 6],
        out_channels=[64, 128, 256, 512, 1024],
        fuse_P2=True,
        cspsppf=True,
        ),
    neck=dict(
        type='RepBiFPANNeck',
        num_repeats=[12, 12, 12, 12],
        out_channels=[256, 128, 128, 256, 256, 512],
        ),
    head=dict(
        type='EffiDeHead',
        in_channels=[128, 256, 512],
        num_layers=3,
        begin_indices=24,
        npr=256,
        nm=32,
        isseg=True,
        issolo=False,
        anchors=3,
        anchors_init=[[10,13, 19,19, 33,23],
                      [30,61, 59,59, 59,119],
                      [116,90, 185,185, 373,326]],
        out_indices=[17, 20, 23],
        strides=[8, 16, 32],
        atss_warmup_epoch=0,
        iou_type='siou',
        use_dfl=False, # set to True if you want to further train with distillation
        reg_max=0, # set to 16 if you want to further train with distillation
        distill_weight={
            'class': 1.0,
            'dfl': 1.0,
        },
    )
)

solver = dict(
    optim='SGD',
    lr_scheduler='Cosine',
    lr0=0.02,
    lrf=0.01,
    momentum=0.937,
    weight_decay=0.001,
    warmup_epochs=3.0,
    warmup_momentum=0.8,
    warmup_bias_lr=0.1
)

data_aug = dict(
    hsv_h=0.015,
    hsv_s=0.7,
    hsv_v=0.4,
    degrees=0.0,
    translate=0.1,
    scale=0.5,
    shear=0.0,
    flipud=0.0,
    fliplr=0.5,
    mosaic=1.0,
    mixup=0.0,
)

ptq = dict(
    num_bits = 8,
    calib_batches = 4,
    # 'max', 'histogram'
    calib_method = 'max',
    # 'entropy', 'percentile', 'mse'
    histogram_amax_method='entropy',
    histogram_amax_percentile=99.99,
    calib_output_path='./',
    sensitive_layers_skip=False,
    sensitive_layers_list=[],
)

qat = dict(
    calib_pt = './assets/v6n_calib_max.pt',
    sensitive_layers_skip = False,
    sensitive_layers_list=[],
)
# Choose Rep-block by the training mode, choices=["repvgg", "hyper-search", "repopt"]
training_mode='repopt'

I then ran the following command:

python tools/train.py --data data/data.yaml --output-dir ./runs/train_im256_30636_qat --conf configs/yolov6n_seg_opt_qat.py --quant --distill --distill_feat --batch 32 --epochs 10 --workers 32 --teacher_model_path "D:/YOLOv6-seg/assets/pretrained_opt.pt" --device 0

It loaded the model first, but then failed with this error:

Skip Layer detect.proj_conv
Traceback (most recent call last):
  File "D:\YOLOv6-seg\tools\train.py", line 142, in <module>
    main(args)
  File "D:\YOLOv6-seg\tools\train.py", line 127, in main
    trainer = Trainer(args, cfg, device)
  File "D:\YOLOv6-seg\yolov6\core\engine.py", line 68, in __init__
    self.quant_setup(model, cfg, device)
  File "D:\YOLOv6-seg\yolov6\core\engine.py", line 602, in quant_setup
    model.neck.upsample_enable_quant(cfg.ptq.num_bits, cfg.ptq.calib_method)
  File "C:\Users\Hamed\miniconda3\envs\yolov6\lib\site-packages\torch\nn\modules\module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'RepBiFPANNeck' object has no attribute 'upsample_enable_quant'

I got this error for both PTQ and QAT.

hamedgorji commented 5 months ago

Update 2: I fixed the above error by adding the following function to the RepBiFPANNeck class:

    def upsample_enable_quant(self, num_bits, calib_method):
        print("Insert fakequant after upsample")
        from pytorch_quantization import nn as quant_nn
        from pytorch_quantization.tensor_quant import QuantDescriptor
        conv2d_input_default_desc = QuantDescriptor(num_bits=num_bits, calib_method=calib_method)
        self.upsample_feat0_quant = quant_nn.TensorQuantizer(conv2d_input_default_desc)
        self.upsample_feat1_quant = quant_nn.TensorQuantizer(conv2d_input_default_desc)
        self._QUANT = True

But now I get another error, this time about the calibration amax, when I try to run PTQ:

python tools/train.py --data data/data.yaml --output-dir ./runs/train_im256_30636_ptq --conf configs/yolov6n_seg_opt_qat.py --quant --calib --batch 16 --workers 0 --device 0

Traceback (most recent call last):
  File "D:\YOLOv6-seg\tools\train.py", line 142, in <module>
    main(args)
  File "D:\YOLOv6-seg\tools\train.py", line 130, in main
    trainer.calibrate(cfg)
  File "D:\YOLOv6-seg\yolov6\core\engine.py", line 592, in calibrate
    ptq_calibrate(self.model, self.train_loader, cfg)
  File "D:\YOLOv6-seg\tools\qat\qat_utils.py", line 61, in ptq_calibrate
    compute_amax(model, method=cfg.ptq.histogram_amax_method, percentile=cfg.ptq.histogram_amax_percentile)
  File "D:\YOLOv6-seg\tools\qat\qat_utils.py", line 47, in compute_amax
    module.load_calib_amax()
  File "C:\Users\Hamed\miniconda3\envs\yolov6\lib\site-packages\pytorch_quantization\nn\modules\tensor_quantizer.py", line 237, in load_calib_amax
    raise RuntimeError(err_msg + " Passing 'strict=False' to `load_calib_amax()` will ignore the error.")
RuntimeError: Calibrator returned None. This usually happens when calibrator hasn't seen any tensor. Passing 'strict=False' to `load_calib_amax()` will ignore the error.

hamedgorji commented 5 months ago

@Chilicyy Any thoughts on this?

hamedgorji commented 5 months ago

Update 3: As mentioned above, during the PTQ process I encountered a new error related to the calibration amax: the calibrator returned None, which means it never saw any tensor during calibration.

To diagnose this, I added detailed logging and found that the neck.upsample_feat0_quant and neck.upsample_feat1_quant layers were the ones failing:

Error for neck.upsample_feat0_quant: Calibrator returned None. This usually happens when calibrator hasn't seen any tensor. Passing 'strict=False' to `load_calib_amax()` will ignore the error.
Loaded calib_amax for neck.upsample_feat0_quant
Error for neck.upsample_feat1_quant: Calibrator returned None. This usually happens when calibrator hasn't seen any tensor. Passing 'strict=False' to `load_calib_amax()` will ignore the error.
Loaded calib_amax for neck.upsample_feat1_quant

It seems that during the calibration phase, these layers are not receiving the expected data, leading to the calibrator returning None.
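
For reference, this is roughly the kind of per-quantizer check I added. It is a simplified sketch of the compute_amax loop in tools/qat/qat_utils.py with extra logging on top; the function name and error handling are illustrative, and it assumes calib_method='max' as in my config, so no histogram arguments are passed:

from pytorch_quantization import nn as quant_nn

def debug_load_amax(model):
    # Walk every TensorQuantizer and try to load its calibrated amax, logging failures.
    for name, module in model.named_modules():
        if isinstance(module, quant_nn.TensorQuantizer) and module._calibrator is not None:
            try:
                module.load_calib_amax()
                print(f"Loaded calib_amax for {name}")
            except RuntimeError as err:
                # This branch fires for quantizers that never saw a tensor during
                # calibration, e.g. neck.upsample_feat0_quant / neck.upsample_feat1_quant.
                print(f"Error for {name}: {err}")
                module.load_calib_amax(strict=False)
                print(f"Loaded calib_amax for {name}")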

Maybe the issue is that I added the following function to the RepBiFPANNeck class without modifying the forward pass, so the new quantizers are never actually invoked.

    def upsample_enable_quant(self, num_bits, calib_method):
        print("Insert fakequant after upsample")
        # Insert fakequant after upsample op to build TensorRT engine
        from pytorch_quantization import nn as quant_nn
        from pytorch_quantization.tensor_quant import QuantDescriptor
        conv2d_input_default_desc = QuantDescriptor(num_bits=num_bits, calib_method=calib_method)
        self.upsample_feat0_quant = quant_nn.TensorQuantizer(conv2d_input_default_desc)
        self.upsample_feat1_quant = quant_nn.TensorQuantizer(conv2d_input_default_desc)
        # global _QUANT
        self._QUANT = True

Any suggestions?

hamedgorji commented 5 months ago

Update 4: I made some changes to the RepBiFPANNeck forward function, similar to what RepPANNeck does, so the new quantizers are actually called, and the problem is solved.

    def forward(self, input):
        (x3, x2, x1, x0) = input

        fpn_out0 = self.reduce_layer0(x0)
        f_concat_layer0 = self.Bifusion0([fpn_out0, x1, x2])
        if hasattr(self, '_QUANT') and self._QUANT is True:
            f_concat_layer0 = self.upsample_feat0_quant(f_concat_layer0)
        f_out0 = self.Rep_p4(f_concat_layer0)

        fpn_out1 = self.reduce_layer1(f_out0)
        f_concat_layer1 = self.Bifusion1([fpn_out1, x2, x3])
        if hasattr(self, '_QUANT') and self._QUANT is True:
            f_concat_layer1 = self.upsample_feat1_quant(f_concat_layer1)
        pan_out2 = self.Rep_p3(f_concat_layer1)

        down_feat1 = self.downsample2(pan_out2)
        p_concat_layer1 = torch.cat([down_feat1, fpn_out1], 1)
        pan_out1 = self.Rep_n3(p_concat_layer1)

        down_feat0 = self.downsample1(pan_out1)
        p_concat_layer2 = torch.cat([down_feat0, fpn_out0], 1)
        pan_out0 = self.Rep_n4(p_concat_layer2)

        outputs = [pan_out2, pan_out1, pan_out0]

        return outputs

hamedgorji commented 4 months ago

@zhiyelee @cfc4n @yeldarby @rainsun I was finally able to train my model with QAT. However, when I converted it to ONNX using qat_export.py, the model failed to produce any segmentation output. This segmentation QAT path has been quite problematic, and I'm puzzled as to why it was included when it is not fully tested. I spent about three weeks troubleshooting various issues but still couldn't get it to work.
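
As a quick sanity check on the exported ONNX file, something along these lines with onnxruntime shows whether the mask outputs are present at all (the file name and 256x256 input size are placeholders from my setup, not anything produced by qat_export.py):

import numpy as np
import onnxruntime as ort

# Load the exported model and list its outputs; a working seg export should expose
# mask/prototype outputs in addition to the detection output.
sess = ort.InferenceSession("yolov6n_seg_qat.onnx", providers=["CPUExecutionProvider"])
print([(o.name, o.shape) for o in sess.get_outputs()])

# Run a dummy image and look at output magnitudes; all-zero mask coefficients
# point to a broken export rather than broken training.
dummy = np.random.rand(1, 3, 256, 256).astype(np.float32)
outputs = sess.run(None, {sess.get_inputs()[0].name: dummy})
for out in outputs:
    print(out.shape, float(np.abs(out).max()))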

If it had been mentioned that the segmentation model does not support QAT, I could have explored other options instead of losing three weeks of my time.