Training error - Githubissues

xuexisuanfa commented 1 year ago

(yolov8obb) root@root:~/yolov8obb20230908/Yolov8_obb_Prune_Track-main$ python train.py --data 'data/yolov8obb_demo.yaml' --hyp 'data/hyps/obb/hyp.finetune_dota.yaml' --cfg models/yaml/yolov8n.yaml --epochs 300 --batch-size 4 --img 640 --device 0
/home/root/anaconda3/envs/yolov8obb/lib/python3.8/site-packages/mmcv/__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
  warnings.warn(
train: weights=, cfg=models/yaml/yolov8n.yaml, data=data/yolov8obb_demo.yaml, hyp=data/hyps/obb/hyp.finetune_dota.yaml, epochs=300, batch_size=4, imgsz=640, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, evolve=None, bucket=, cache=None, image_weights=False, device=0, multi_scale=False, single_cls=False, adam=False, lion=False, sync_bn=False, workers=8, project=runs/train, name=exp, exist_ok=False, quad=False, linear_lr=False, label_smoothing=0.0, patience=100, freeze=[0], save_period=-1, local_rank=-1, entity=None, upload_dataset=False, bbox_interval=-1, artifact_alias=latest
github: skipping check (not a git repository), for updates see https://github.com/ultralytics/yolov5
YOLOv5 🚀 2023-9-6 torch 1.10.1+cu111 CUDA:0 (NVIDIA GeForce RTX 3060, 12053MiB)

hyperparameters: lr0=0.001, lrf=0.2, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.5, cls_pw=1.0, theta=0.5, theta_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=1.5, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0, translate=0.1, scale=0.25, shear=0.0, perspective=0.0, flipud=0.5, fliplr=0.5, mosaic=0.75, mixup=0.0, copy_paste=0.0, cls_theta=180, csl_radius=2.0, no_rotato_ratio=1.0
Weights & Biases: run 'pip install wandb' to automatically track and visualize YOLOv5 🚀 runs (RECOMMENDED)
TensorBoard: Start with 'tensorboard --logdir runs/train', view at http://localhost:6006/

      from          n   params  module                                  arguments
ch [3]
  0   -1            1      464  models.common.Conv                      [3, 16, 3, 2]
  1   -1            1     4672  models.common.Conv                      [16, 32, 3, 2]
  2   -1            1     7360  models.common.C2f                       [32, 32, 1, True]
  3   -1            1    18560  models.common.Conv                      [32, 64, 3, 2]
  4   -1            2    49664  models.common.C2f                       [64, 64, 2, True]
  5   -1            1    73984  models.common.Conv                      [64, 128, 3, 2]
  6   -1            2   197632  models.common.C2f                       [128, 128, 2, True]
  7   -1            1   295424  models.common.Conv                      [128, 256, 3, 2]
  8   -1            1   460288  models.common.C2f                       [256, 256, 1, True]
  9   -1            1   164608  models.common.SPPF                      [256, 256, 5]
 10   -1            1        0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 11   [-1, 6]       1        0  models.common.Concat                    [1]
 12   -1            1   148224  models.common.C2f                       [384, 128, 1]
 13   -1            1        0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 14   [-1, 4]       1        0  models.common.Concat                    [1]
 15   -1            1    37248  models.common.C2f                       [192, 64, 1]
 16   -1            1    36992  models.common.Conv                      [64, 64, 3, 2]
 17   [-1, 12]      1        0  models.common.Concat                    [1]
 18   -1            1   123648  models.common.C2f                       [192, 128, 1]
 19   -1            1   147712  models.common.Conv                      [128, 128, 3, 2]
 20   [-1, 9]       1        0  models.common.Concat                    [1]
 21   -1            1   493056  models.common.C2f                       [384, 256, 1]
 22   [15, 18, 21]  1  1121305  models.yolo.Detect_v8                   [2, [64, 128, 256]]
Model Summary: 312 layers, 3380841 parameters, 3380825 gradients, 9.6 GFLOPs

Scaled weight_decay = 0.0005
optimizer: SGD with parameter groups 63 weight, 73 weight (no decay), 72 bias
train: Scanning '/home/root/dataout_jpg/labelTxt' images and labels...20125 found, 0 missing, 11796 empty, 0 corrupted: 100%|██████████| 20125/20125 [00:01<00:00, 14827.96it/s]
train: New cache created: /home/root/dataout_jpg/labelTxt.cache
val: Scanning '/home/root/dataout_jpg/labelTxt' images and labels...20125 found, 0 missing, 11796 empty, 0 corrupted: 100%|██████████| 20125/20125 [00:01<00:00, 14881.40it/s]
val: New cache created: /home/root/dataout_jpg/labelTxt.cache
Plotting labels to runs/train/exp2/labels_xyls.jpg...
Image sizes 640 train, 640 val
Using 4 dataloader workers
Logging results to runs/train/exp2
Starting training for 300 epochs...

 Epoch   gpu_mem       box       cls       dfl    labels  img_size
 0/299     1.11G     5.268   0.02735   0.09291         2       640:  58%|█████▊    | 2908/5032 [02:35<01:53, 18.71it/s]

Traceback (most recent call last):
  File "train.py", line 646, in <module>
    main(opt)
  File "train.py", line 543, in main
    train(opt.hyp, opt, device, callbacks)
  File "train.py", line 331, in train
    loss, loss_items = compute_loss(pred, targets.to(device))  # loss scaled by batch_size
  File "/home/root/yolov8obb20230908/Yolov8_obb_Prune_Track-main/utils/loss.py", line 85, in __call__
    target_labels, target_bboxes, target_scores, fg_mask, _ = self.assigner(
  File "/home/root/anaconda3/envs/yolov8obb/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/root/anaconda3/envs/yolov8obb/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/root/yolov8obb20230908/Yolov8_obb_Prune_Track-main/utils/tal.py", line 133, in forward
    mask_pos, align_metric, overlaps = self.get_pos_mask(pd_scores, pd_bboxes, gt_labels, gt_bboxes, anc_points,
  File "/home/root/yolov8obb20230908/Yolov8_obb_Prune_Track-main/utils/tal.py", line 156, in get_pos_mask
    align_metric, overlaps = self.get_box_metrics(pd_scores, pd_bboxes, gt_labels, gt_bboxes, mask_in_gts * mask_gt)
  File "/home/root/yolov8obb20230908/Yolov8_obb_Prune_Track-main/utils/tal.py", line 209, in get_box_metrics
    overlaps[mask_gt]=rotated_iou_similarity(gt_boxes,pd_boxes)
  File "/home/root/yolov8obb20230908/Yolov8_obb_Prune_Track-main/utils/tal.py", line 91, in rotated_iou_similarity
    return torch.stack(rotated_ious, axis=0)
RuntimeError: stack expects a non-empty TensorList

ulyduts commented 1 year ago

I am running into this problem too. Have you solved it?

xuexisuanfa commented 1 year ago

I have tracked down the cause: in the get_box_metrics function in tal.py, mask_gt leaves pd_boxes and gt_boxes empty. If you solve this problem, please let us know.
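A minimal sketch of one possible workaround, assuming the empty gt_boxes comes from a batch whose targets have zero rows: skip such batches before the loss is computed. The helper name skip_empty_batches and the position of the targets inside each batch tuple are hypothetical and not taken from this repository. Plain lists stand in for tensors here; len() also returns the row count of a torch tensor.

```python
from typing import Iterable, Iterator, Sequence


def skip_empty_batches(batches: Iterable[Sequence], targets_index: int = 1) -> Iterator[Sequence]:
    """Yield only the batches whose targets entry has at least one row.

    targets_index is the assumed position of the targets in each batch tuple
    (hypothetical; adjust it to match the actual dataloader output).
    """
    for batch in batches:
        targets = batch[targets_index]
        if len(targets) == 0:
            # No ground-truth boxes in this batch, so nothing for the assigner to match.
            continue
        yield batch


# Toy batches: (image placeholder, list of target rows)
demo = [
    ("img0", [[0, 1, 0.5, 0.5, 0.2, 0.1, 90]]),
    ("img1", []),
    ("img2", [[0, 0, 0.3, 0.3, 0.1, 0.1, 45]]),
]
kept = [batch[0] for batch in skip_empty_batches(demo)]
print(kept)
```

Skipping batches reduces the number of optimizer steps per epoch, so this is a stopgap rather than a fix inside tal.py.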

yzqxy commented 1 year ago

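Your dataset contains quite a few images with empty labels, so if every image loaded into a batch happens to have an empty label, gt_boxes ends up empty and the error is raised. I suggest filtering out the empty-label images before training. That said, this does count as a minor bug, and I will fix it in a later update.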
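The filtering suggested above could start from a listing of the empty label files. This is a minimal sketch, assuming one .txt label file per image in a single folder (such as the labelTxt directory in the log); the function name find_empty_labels and the *.txt pattern are illustrative assumptions, not part of the repository.

```python
from pathlib import Path
from typing import List


def find_empty_labels(label_dir: str, pattern: str = "*.txt") -> List[Path]:
    """Return the label files in label_dir that contain no non-blank lines."""
    empty = []
    for path in sorted(Path(label_dir).glob(pattern)):
        text = path.read_text(encoding="utf-8", errors="ignore")
        # A file counts as empty when every line is blank or whitespace only.
        if not any(line.strip() for line in text.splitlines()):
            empty.append(path)
    return empty
```

Calling find_empty_labels on the label folder returns the paths to review; the matching images can then be moved out of the training set, and any existing label cache file should be deleted so that it is rebuilt.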

ulyduts commented 1 year ago

I checked my data and there are no empty labels; every label file has at least one line. Could there be some other cause?

ulyduts commented 1 year ago

Also, the first epoch runs to completion; the error only appears in the second epoch.

hecheng000 commented 1 year ago

Traceback (most recent call last):
  File "/home/hc/下载/yolo/Yolov8_obb_Prune_Track-main/train.py", line 648, in <module>
    main(opt)
  File "/home/hc/下载/yolo/Yolov8_obb_Prune_Track-main/train.py", line 545, in main
    train(opt.hyp, opt, device, callbacks)
  File "/home/hc/下载/yolo/Yolov8_obb_Prune_Track-main/train.py", line 118, in train
    model = Model(cfg or ckpt['model'].yaml, ch=3, nc=nc).to(device)  # create
  File "/home/hc/下载/yolo/Yolov8_obb_Prune_Track-main/models/yolo.py", line 180, in __init__
    self.model, self.save = parse_model(deepcopy(self.yaml), ch=[ch])  # model, savelist
  File "/home/hc/下载/yolo/Yolov8_obb_Prune_Track-main/models/yolo.py", line 332, in parse_model
    m = eval(m) if isinstance(m, str) else m  # eval strings
  File "<string>", line 1, in <module>
NameError: name 'Detect' is not defined

ChairyX commented 7 months ago

I ran into this problem too (the NameError above). Has it been solved?

hecheng000 commented 6 months ago

Use the official YOLOv8_obb instead.

yzqxy / Yolov8_obb_Prune_Track

Training error #3