yang-0201 / YOLOv6_pro

Make it easier for yolov6 to change the network structure
GNU General Public License v3.0

Why do I get this error at the start of a run: RuntimeError: The size of tensor a (8) must match the size of tensor b (4) at non-singleton dimension 2 #1

Open mmdd1314 opened 1 year ago

mmdd1314 commented 1 year ago

Error: RuntimeError: The size of tensor a (8) must match the size of tensor b (4) at non-singleton dimension 2

The training command is: python tools/train.py --conf-file configs/model_yaml/yolov6t_GiraffeNeckV2_yaml.py --data data/data.yaml --device 0 --workers 16 --img-size 640 --batch-size 32

Model Summary: 550 layers, 13156006 parameters, 13155972 gradients
Loading state_dict from weights/yolov6t_yaml_new.pt for fine-tuning...
backbone.10.conv.weightcan not change
backbone.10.bn.weightcan not change
backbone.10.bn.biascan not change
backbone.10.bn.running_meancan not change
backbone.10.bn.running_varcan not change
backbone.14.conv.weightcan not change
backbone.14.bn.weightcan not change
backbone.14.bn.biascan not change
backbone.14.bn.running_meancan not change
backbone.14.bn.running_varcan not change
******************
transform_weight: 314/668 from weights/yolov6t_yaml_new.pt
transform_model_weight: 314/768 from weights/yolov6t_yaml_new.pt
******************
Training start...

     Epoch  iou_loss  dfl_loss  cls_loss
  0%|          | 0/186 [00:00<?, ?it/s]
/root/miniconda3/envs/py38/lib/python3.8/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  ../aten/src/ATen/native/TensorShape.cpp:2157.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
  0%|          | 0/186 [00:00<?, ?it/s]
ERROR in training steps.
ERROR in training loop or eval/save model.
Traceback (most recent call last):
  File "tools/train.py", line 127, in <module>
    main(args)
  File "tools/train.py", line 117, in main
    trainer.train()
  File "/root/autodl-nas/YOLOv6_pro-main/yolov6/core/engine.py", line 99, in train
    self.train_in_loop(self.epoch)
  File "/root/autodl-nas/YOLOv6_pro-main/yolov6/core/engine.py", line 113, in train_in_loop
    self.train_in_steps(epoch_num, self.step)
  File "/root/autodl-nas/YOLOv6_pro-main/yolov6/core/engine.py", line 142, in train_in_steps
    total_loss, loss_items = self.compute_loss(preds, targets, epoch_num, step_num)
  File "/root/autodl-nas/YOLOv6_pro-main/yolov6/models/loss.py", line 155, in __call__
    loss_cls = self.varifocal_loss(pred_scores, target_scores, one_hot_label)
  File "/root/miniconda3/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/autodl-nas/YOLOv6_pro-main/yolov6/models/loss.py", line 196, in forward
    weight = alpha * pred_score.pow(gamma) * (1 - label) + gt_score * label
RuntimeError: The size of tensor a (8) must match the size of tensor b (4) at non-singleton dimension 2
yang-0201 commented 1 year ago

Thank you for your support. You need to extend the parameters of the three Head_layers modules in the config file under configs/yaml/ to:

  1. For the s and t models, change the parameters to [128, 0, class_number].
  2. For the m and l models, change the parameters to [128, 16, class_number].

Here class_number is the number of classes in your dataset. This problem will be improved later.
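
To make the rule above concrete, here is a minimal sketch in plain Python. It is not code from YOLOv6_pro: the helper names `head_layer_params` and `first_mismatched_dim` are hypothetical, and the tensor shapes (batch 32, 8400 anchors, 8 vs. 4 classes) are assumed for illustration only, since the log does not say which tensor carries which class count. The first helper builds the Head_layers parameter list described in the reply. The second applies the usual broadcasting rule (sizes must be equal, or one of them must be 1) to show why a class-count mismatch surfaces at dimension 2.

```python
# Illustrative sketch only: these helpers are hypothetical and are not
# part of YOLOv6_pro.

def head_layer_params(model_size, class_number):
    """Return the Head_layers parameter list suggested in the reply above."""
    size = model_size.lower()
    if size in ("s", "t"):
        return [128, 0, class_number]
    if size in ("m", "l"):
        return [128, 16, class_number]
    raise ValueError(f"unknown model size: {model_size!r}")


def first_mismatched_dim(shape_a, shape_b):
    """Return the first dimension where two equal-rank shapes cannot be
    broadcast together, or None if the shapes are compatible."""
    if len(shape_a) != len(shape_b):
        raise ValueError("this sketch only compares shapes of equal rank")
    for dim, (a, b) in enumerate(zip(shape_a, shape_b)):
        # Broadcasting allows equal sizes, or a size of 1 on either side.
        if a != b and a != 1 and b != 1:
            return dim
    return None


# Assumed shapes, laid out as (batch, anchors, classes).
pred_scores_shape = (32, 8400, 8)
one_hot_label_shape = (32, 8400, 4)

print(first_mismatched_dim(pred_scores_shape, one_hot_label_shape))  # 2
print(head_layer_params("t", 4))  # [128, 0, 4]
```

Once the class count passed to Head_layers equals the number of classes in the data file, both shapes share the same last dimension and the check returns None.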