Closed LuoPeng-CV closed 1 year ago
I suggest trying the solution in #572. I would be glad to hear your feedback after you try it.
Your network and checkpoint do not match: one has BN layers and the other does not. Check where the layers were fused.
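For context on why a fused checkpoint carries a conv bias but no BN tensors, here is a minimal plain-Python sketch of folding BatchNorm into the preceding bias-free convolution. It is an illustration under stated assumptions, not YOLOv6's own fusion code: the function names are hypothetical, each output channel is reduced to a flat weight list, and `eps=1e-5` is assumed to match the BatchNorm default.

```python
import math


def fold_bn_into_conv(weight, gamma, beta, running_mean, running_var, eps=1e-5):
    """Fold per-channel BN parameters into a bias-free conv (hypothetical helper).

    weight: one flat list of kernel weights per output channel.
    Returns (fused_weight, fused_bias); after folding, the BN tensors are gone
    and the conv gains a bias, which is the key pattern seen in the error.
    """
    fused_weight, fused_bias = [], []
    for w_c, g, b, m, v in zip(weight, gamma, beta, running_mean, running_var):
        scale = g / math.sqrt(v + eps)      # per-channel BN scale
        fused_weight.append([w * scale for w in w_c])
        fused_bias.append(b - m * scale)    # BN shift becomes the conv bias
    return fused_weight, fused_bias


def conv_then_bn(x, w_c, g, b, m, v, eps=1e-5):
    """Reference path: dot product (one conv output) followed by BN in eval mode."""
    y = sum(wi * xi for wi, xi in zip(w_c, x))
    return g * (y - m) / math.sqrt(v + eps) + b
```

Both paths produce the same output for the same input, so a fused checkpoint is numerically equivalent at inference time but no longer has the `bn.*` keys that an unfused model definition expects.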
Missing key(s) in state_dict: "backbone.ERBlock_5.2.cv1.bn.weight", "backbone.ERBlock_5.2.cv1.bn.bias", "backbone.ERBlock_5.2.cv1.bn.running_mean", "backbone.ERBlock_5.2.cv1.bn.running_var",
Unexpected key(s) in state_dict: "backbone.ERBlock_5.2.cv1.conv.bias",
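The pattern in these key lists (BN tensors missing, a conv bias unexpected, same layer prefix) can be checked mechanically before calling `load_state_dict`. The sketch below works on plain key names, so it runs without PyTorch; `diff_keys` and `fused_layers` are hypothetical helper names, and the suffixes are taken from the keys in the error message.

```python
BN_SUFFIXES = (".bn.weight", ".bn.bias", ".bn.running_mean", ".bn.running_var")
CONV_BIAS_SUFFIX = ".conv.bias"


def diff_keys(model_keys, ckpt_keys):
    """Return (missing, unexpected) key lists, mirroring load_state_dict's report."""
    model_keys, ckpt_keys = set(model_keys), set(ckpt_keys)
    missing = sorted(model_keys - ckpt_keys)      # model expects, checkpoint lacks
    unexpected = sorted(ckpt_keys - model_keys)   # checkpoint has, model lacks
    return missing, unexpected


def fused_layers(missing, unexpected):
    """Layer prefixes whose BN keys are missing while a conv bias is unexpected."""
    bn_prefixes = {
        key[: -len(suffix)]
        for key in missing
        for suffix in BN_SUFFIXES
        if key.endswith(suffix)
    }
    bias_prefixes = {
        key[: -len(CONV_BIAS_SUFFIX)]
        for key in unexpected
        if key.endswith(CONV_BIAS_SUFFIX)
    }
    return sorted(bn_prefixes & bias_prefixes)
```

In practice the inputs would be `model.state_dict().keys()` and the keys of the state dict loaded from the calib_pt file. A non-empty result from `fused_layers` suggests the checkpoint was saved after conv-BN fusion while the model being built still has separate BN layers.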
Do not use sensitivity_analyse.py to generate the calib_pt; generate it with PTQ during training instead.
@LuoPeng-CV How do I use ptq.py to build it? sensitivity_analyse.py calls the function in ptq.py.
@barathsku Sorry, I was wrong. You can get the calib_pt by adding '--quant' and '--calib' when training; that way you will not hit the error above.
Before Asking
[X] I have read the README carefully.
[X] I want to train my custom dataset, and I have read the tutorials for training on custom data carefully and organized my dataset correctly. (FYI: We recommend using the xx_finetune.py config files.)
[X] I have pulled the latest code from the main branch and run it again, and the problem still exists.
Search before asking
Question
Hello, I get the following error when running QAT training:
Skip Layer detect.proj_conv
Insert fakequant after upsample
Traceback (most recent call last):
  File "tools/train.py", line 126, in <module>
    main(args)
  File "tools/train.py", line 111, in main
    trainer = Trainer(args, cfg, device)
  File "/home/novasky/lp/YOLOv6-main/yolov6/core/engine.py", line 56, in __init__
    self.quant_setup(model, cfg, device)
  File "/home/novasky/lp/YOLOv6-main/yolov6/core/engine.py", line 537, in quant_setup
    model.load_state_dict(torch.load(cfg.qat.calib_pt)['model'].float().state_dict())
  File "/home/novasky/anaconda3/envs/novasky/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1497, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Model:
    Missing key(s) in state_dict: "backbone.ERBlock_5.2.cv1.bn.weight", "backbone.ERBlock_5.2.cv1.bn.bias", "backbone.ERBlock_5.2.cv1.bn.running_mean", "backbone.ERBlock_5.2.cv1.bn.running_var", "backbone.ERBlock_5.2.cv2.bn.weight", "backbone.ERBlock_5.2.cv2.bn.bias", "backbone.ERBlock_5.2.cv2.bn.running_mean", "backbone.ERBlock_5.2.cv2.bn.running_var", "neck.reduce_layer0.bn.weight", "neck.reduce_layer0.bn.bias", "neck.reduce_layer0.bn.running_mean", "neck.reduce_layer0.bn.running_var", "neck.reduce_layer1.bn.weight", "neck.reduce_layer1.bn.bias", "neck.reduce_layer1.bn.running_mean", "neck.reduce_layer1.bn.running_var", "neck.downsample2.bn.weight", "neck.downsample2.bn.bias", "neck.downsample2.bn.running_mean", "neck.downsample2.bn.running_var", "neck.downsample1.bn.weight", "neck.downsample1.bn.bias", "neck.downsample1.bn.running_mean", "neck.downsample1.bn.running_var", "detect.stems.0.bn.weight", "detect.stems.0.bn.bias", "detect.stems.0.bn.running_mean", "detect.stems.0.bn.running_var", "detect.stems.1.bn.weight", "detect.stems.1.bn.bias", "detect.stems.1.bn.running_mean", "detect.stems.1.bn.running_var", "detect.stems.2.bn.weight", "detect.stems.2.bn.bias", "detect.stems.2.bn.running_mean", "detect.stems.2.bn.running_var", "detect.cls_convs.0.bn.weight", "detect.cls_convs.0.bn.bias", "detect.cls_convs.0.bn.running_mean", "detect.cls_convs.0.bn.running_var", "detect.cls_convs.1.bn.weight", "detect.cls_convs.1.bn.bias", "detect.cls_convs.1.bn.running_mean", "detect.cls_convs.1.bn.running_var", "detect.cls_convs.2.bn.weight", "detect.cls_convs.2.bn.bias", "detect.cls_convs.2.bn.running_mean", "detect.cls_convs.2.bn.running_var", "detect.reg_convs.0.bn.weight", "detect.reg_convs.0.bn.bias", "detect.reg_convs.0.bn.running_mean", "detect.reg_convs.0.bn.running_var", "detect.reg_convs.1.bn.weight", "detect.reg_convs.1.bn.bias", "detect.reg_convs.1.bn.running_mean", "detect.reg_convs.1.bn.running_var", "detect.reg_convs.2.bn.weight", "detect.reg_convs.2.bn.bias", "detect.reg_convs.2.bn.running_mean", "detect.reg_convs.2.bn.running_var".
    Unexpected key(s) in state_dict: "backbone.ERBlock_5.2.cv1.conv.bias", "backbone.ERBlock_5.2.cv2.conv.bias", "neck.reduce_layer0.conv.bias", "neck.reduce_layer1.conv.bias", "neck.downsample2.conv.bias", "neck.downsample1.conv.bias", "detect.stems.0.conv.bias", "detect.stems.1.conv.bias", "detect.stems.2.conv.bias", "detect.cls_convs.0.conv.bias", "detect.cls_convs.1.conv.bias", "detect.cls_convs.2.conv.bias", "detect.reg_convs.0.conv.bias", "detect.reg_convs.1.conv.bias", "detect.reg_convs.2.conv.bias".
Skip Layer detect.proj_conv
The training command is:
CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 tools/train.py --data sp500rvf.yaml --output-dir ./runs/opt_train_v6n_qat --conf configs/repopt/yolov6n_opt_qat.py --quant --distill --distill_feat --batch 128 --epochs 10 --workers 8 --teacher_model_path runs/train/yolov6s_repopt/weights/best_ckpt.pt --device 0,1 --name v6n_kd_qat
The calib_pt referenced in yolov6n_opt_qat was generated by sensitivity_analyse.py:
python3 sensitivity_analyse.py --weights ../../runs/train/yolov6n_repopt/weights/best_ckpt.pt --batch-size 32 --batch-number 4 --data-root ~/Dataset/v6dataset/ --img-size 640 --data-yaml ../../sp500rvf.yaml --eval-yaml eval.yaml
Could you give me some advice? Thank you.
Additional
No response