@LuoPeng-CV It has nothing to do with the TRT warnings. We have fixed the visualization bug; you can refer to #610 and use the following command to visualize your quantized model:
python3 visualize.py --imgs-dir ../../../../dataset/coco/images/val2017/ --img-size 640 -m ../../../yolov6s_v2_reopt_qat_43.0_remove_qdq_bs1.sim.int8.trt --conf-thres 0.2 --iou-thres 0.03 --visual-dir yolov6s_visual_out_int8_01
Before Asking
- [x] I have read the README carefully.
- [x] I want to train my custom dataset, and I have read the tutorials for training custom data carefully and organized my dataset correctly. (FYI: We recommend you apply the config files of xx_finetune.py.)
- [x] I have pulled the latest code of the main branch and run again, and the problem still exists.
Search before asking
- [x] I have searched the YOLOv6 issues and found no similar questions.
Question
The figures below compare detection results from the model right after QAT training versus after converting it to TensorRT INT8:
I noticed the officially provided QAT ONNX includes a remove-QDQ step; could skipping that step be the cause of the problem above? Also, when converting ONNX to TRT I get a warning: some weights are outside of int8_t range and will be clipped to int8_t range.
Could you give any suggestions on how to solve this?
Additional
No response
PTQ after RepOpt training has already failed for me and I don't know how to fix it; can anyone share a working setup? Concretely: using my own data plus the opt weights, with yolov6s_v2_scale.pt downloaded as described in the docs, I ran PTQ with:
python tools/train_qat.py --data ./data/dataset.yaml --conf configs/repopt/yolov6s_opt_qat.py --quant --calib
Error: RuntimeError: Calibrator returned None. This usually happens when calibrator hasn't seen any tensor. Passing 'strict=False' to load_calib_amax() will ignore the error.
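For context, below is a minimal sketch of what a pytorch_quantization calibration loop (the mechanism behind `--calib`) looks like, to illustrate the error: "Calibrator returned None" is raised by load_calib_amax() when no batches were fed through the model while the calibrators were enabled, e.g. because the calibration data loader is empty or misconfigured. `model` and `data_loader` are assumed to exist; this is not the repo's exact code.

```python
import torch
from pytorch_quantization import calib
from pytorch_quantization import nn as quant_nn

def collect_stats_and_compute_amax(model, data_loader, num_batches=4):
    # Put every TensorQuantizer into calibration mode: collect statistics, no fake-quant.
    for module in model.modules():
        if isinstance(module, quant_nn.TensorQuantizer):
            module.disable_quant()
            module.enable_calib()

    # Feed calibration batches; activation ranges are recorded during these forwards.
    with torch.no_grad():
        for i, (images, _) in enumerate(data_loader):
            model(images.cuda())
            if i + 1 >= num_batches:
                break

    # Restore quantized behaviour and turn the collected statistics into amax values.
    for module in model.modules():
        if isinstance(module, quant_nn.TensorQuantizer):
            module.enable_quant()
            module.disable_calib()
            if module._calibrator is None:
                continue
            if isinstance(module._calibrator, calib.MaxCalibrator):
                module.load_calib_amax()
            else:
                # 'entropy' matches histogram_amax_method in the ptq config.
                # Passing strict=False here only silences the "returned None" error;
                # the real fix is making sure the calibration loader yields data.
                module.load_calib_amax("entropy")
```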
@zahidzqj Did you load a checkpoint via `pretrained` in the config file?
@LuoPeng-CV Yes, confirmed.
@zahidzqj
```python
# YOLOv6s model
model = dict(
    type='YOLOv6s',
    pretrained='/media/luopeng/E/code/YOLO/YOLOv6/runs/train/yolov6s_repopt/weights/best_ckpt.pt',
    scales='/media/luopeng/E/code/YOLO/YOLOv6/yolov6s_v2_scale.pt',
    depth_multiple=0.33,
    width_multiple=0.50,
    backbone=dict(
        type='EfficientRep',
        num_repeats=[1, 6, 12, 18, 6],
        out_channels=[64, 128, 256, 512, 1024],
    ),
    neck=dict(
        type='RepPANNeck',
        num_repeats=[12, 12, 12, 12],
        out_channels=[256, 128, 128, 256, 256, 512],
    ),
    head=dict(
        type='EffiDeHead',
        in_channels=[128, 256, 512],
        num_layers=3,
        begin_indices=24,
        anchors=1,
        out_indices=[17, 20, 23],
        strides=[8, 16, 32],
        iou_type='giou',
        use_dfl=False,
        reg_max=0,  # if use_dfl is False, please set reg_max to 0
        distill_weight={
            'class': 1.0,
            'dfl': 1.0,
        },
    )
)

solver = dict(
    optim='SGD',
    lr_scheduler='Cosine',
    lr0=0.00001,
    lrf=0.001,
    momentum=0.937,
    weight_decay=0.00005,
    warmup_epochs=3,
    warmup_momentum=0.8,
    warmup_bias_lr=0.1
)

data_aug = dict(
    hsv_h=0.015,
    hsv_s=0.7,
    hsv_v=0.4,
    degrees=0.0,
    translate=0.1,
    scale=0.5,
    shear=0.0,
    flipud=0.0,
    fliplr=0.5,
    mosaic=1.0,
    mixup=0.0,
)

ptq = dict(
    num_bits=8,
    calib_batches=4,
    calib_method='histogram',
    # 'entropy', 'percentile', 'mse'
    histogram_amax_method='entropy',
    histogram_amax_percentile=99.99,
    calib_output_path='./',
    sensitive_layers_skip=False,
    sensitive_layers_list=[],
)

qat = dict(
    calib_pt='/media/luopeng/E/code/YOLO/YOLOv6/assets/yolov6s_calib.pt',
    sensitive_layers_skip=False,
    sensitive_layers_list=[
        'detect.stems.0.conv',
        'detect.stems.1.conv',
        'detect.stems.2.conv',
        'detect.cls_convs.0.conv',
        'detect.cls_convs.1.conv',
        'detect.cls_convs.2.conv',
        'detect.reg_convs.0.conv',
        'detect.reg_convs.1.conv',
        'detect.reg_convs.2.conv',
        'detect.cls_preds.0',
        'detect.cls_preds.1',
        'detect.cls_preds.2',
        'detect.reg_preds.0',
        'detect.reg_preds.1',
        'detect.reg_preds.2',
    ],
)

# Choose Rep-block by the training mode, choices=["repvgg", "hyper-search", "repopt"]
training_mode = 'repopt'
```
You can compare against this.
@LuoPeng-CV It looks basically identical; could the problem be with my weights?
```python
model = dict(
    type='YOLOv6s',
    pretrained='/data/YOLOv6/runs/train/exp7/weights/best_ckpt.pt',
    scales='./assets/yolov6s_v2_scale.pt',
    depth_multiple=0.33,
    width_multiple=0.50,
    backbone=dict(
        type='EfficientRep',
        num_repeats=[1, 6, 12, 18, 6],
        out_channels=[64, 128, 256, 512, 1024],
    ),
    neck=dict(
        type='RepPANNeck',
        num_repeats=[12, 12, 12, 12],
        out_channels=[256, 128, 128, 256, 256, 512],
    ),
    head=dict(
        type='EffiDeHead',
        in_channels=[128, 256, 512],
        num_layers=3,
        begin_indices=24,
        anchors=1,
        out_indices=[17, 20, 23],
        strides=[8, 16, 32],
        iou_type='giou',
        use_dfl=False,
        reg_max=0,  # if use_dfl is False, please set reg_max to 0
        distill_weight={
            'class': 1.0,
            'dfl': 1.0,
        },
    )
)

solver = dict(
    optim='SGD',
    lr_scheduler='Cosine',
    lr0=0.00001,
    lrf=0.001,
    momentum=0.937,
    weight_decay=0.00005,
    warmup_epochs=3,
    warmup_momentum=0.8,
    warmup_bias_lr=0.1
)

data_aug = dict(
    hsv_h=0.015,
    hsv_s=0.7,
    hsv_v=0.4,
    degrees=0.0,
    translate=0.1,
    scale=0.5,
    shear=0.0,
    flipud=0.0,
    fliplr=0.5,
    mosaic=1.0,
    mixup=0.0,
)

ptq = dict(
    num_bits=8,
    calib_batches=4,
    calib_method='histogram',
    # 'entropy', 'percentile', 'mse'
    histogram_amax_method='entropy',
    histogram_amax_percentile=99.99,
    calib_output_path='./',
    sensitive_layers_skip=False,
    sensitive_layers_list=[
        'detect.stems.0.conv',
        'detect.stems.1.conv',
        'detect.stems.2.conv',
        'detect.cls_convs.0.conv',
        'detect.cls_convs.1.conv',
        'detect.cls_convs.2.conv',
        'detect.reg_convs.0.conv',
        'detect.reg_convs.1.conv',
        'detect.reg_convs.2.conv',
        'detect.cls_preds.0',
        'detect.cls_preds.1',
        'detect.cls_preds.2',
        'detect.reg_preds.0',
        'detect.reg_preds.1',
        'detect.reg_preds.2',
    ],
)

qat = dict(
    calib_pt='./assets/yolov6s_v2_reopt_43.1_calib_histogram.pt',
    sensitive_layers_skip=False,
    sensitive_layers_list=[
        'detect.stems.0.conv',
        'detect.stems.1.conv',
        'detect.stems.2.conv',
        'detect.cls_convs.0.conv',
        'detect.cls_convs.1.conv',
        'detect.cls_convs.2.conv',
        'detect.reg_convs.0.conv',
        'detect.reg_convs.1.conv',
        'detect.reg_convs.2.conv',
        'detect.cls_preds.0',
        'detect.cls_preds.1',
        'detect.cls_preds.2',
        'detect.reg_preds.0',
        'detect.reg_preds.1',
        'detect.reg_preds.2',
    ],
)

# Choose Rep-block by the training mode, choices=["repvgg", "hyper-search", "repopt"]
training_mode = 'repopt'
```
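For context, `sensitive_layers_skip` / `sensitive_layers_list` in the configs above are intended to keep quantization-sensitive layers in floating point. A minimal sketch of what that amounts to, assuming the model is built with pytorch_quantization quant modules (this is not the repo's exact helper):

```python
from pytorch_quantization import nn as quant_nn

def skip_sensitive_layers(model, sensitive_layers_list):
    for name, module in model.named_modules():
        if name in sensitive_layers_list and isinstance(module, quant_nn.QuantConv2d):
            # Disabling both quantizers makes this conv run as an ordinary FP conv,
            # so it is exported without Q/DQ nodes around it.
            module._input_quantizer.disable()
            module._weight_quantizer.disable()
            print(f"Skipped quantization for sensitive layer: {name}")
```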
@LuoPeng-CV My V6-RepOpt training procedure (own dataset): in yolov6s_opt.py I changed the weights to scales='/data/YOLOv6/assets/yolov6s_v2_scale.pt' (downloaded from the official release), then trained with: python tools/train.py --conf configs/repopt/yolov6s_opt.py. Step 1 (hyperparameter search) mentioned in the official docs is not required anymore, right?
@zahidzqj Not required; just use the scale file provided by the authors.
After pulling the latest code (2022.12.08), which adds the ONNX remove-QDQ step, re-exporting the QAT ONNX and converting it to TRT, the boxes are correct.
close...
I found the remove_qdq code here: https://github.com/meituan/YOLOv6/commit/54deb7b233e322d9f57ba1c723b7013c53dd08eb#diff-0e63f33ad384f9d1c2694580813fca6e9fb69e5fae1b8e1c30e896e3484b8b2e After adding that code to my project, the detection boxes are correct. Inference is very fast: with batch size 4 and input size 736, inference takes only about 4 ms, but the main cost is the to(device) transfer, which takes 40-50 ms.
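For anyone landing here, the linked commit is the authoritative implementation; below is only a minimal sketch of the idea, assuming the onnx and onnx-graphsurgeon packages: bypass each QuantizeLinear/DequantizeLinear pair in the exported QAT ONNX and keep the per-tensor activation scales so they can later be written into a TensorRT calibration cache. File names are hypothetical.

```python
import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load("yolov6s_qat.onnx"))
scales = {}  # tensor name -> activation scale, for a TRT calibration cache

for q in [n for n in graph.nodes if n.op == "QuantizeLinear"]:
    # Expect the pattern: tensor -> QuantizeLinear -> DequantizeLinear -> consumers
    dq_nodes = [n for n in q.outputs[0].outputs if n.op == "DequantizeLinear"]
    if not dq_nodes:
        continue
    dq = dq_nodes[0]
    scale = q.inputs[1].values        # assumes the scale is an initializer (gs.Constant)
    if scale.size == 1:               # per-tensor (activation) scale -> keep for the cache
        scales[q.inputs[0].name] = float(scale)
    # Re-wire every consumer of the DQ output to read the original tensor,
    # bypassing the Q/DQ pair (graph outputs are assumed not to be Q/DQ'd).
    for consumer in list(dq.outputs[0].outputs):
        for idx, tensor in enumerate(consumer.inputs):
            if tensor is dq.outputs[0]:
                consumer.inputs[idx] = q.inputs[0]

graph.cleanup().toposort()  # drops the now-dangling Q/DQ nodes
onnx.save(gs.export_onnx(graph), "yolov6s_qat_remove_qdq.onnx")
```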