Closed xuk997 closed 2 years ago
👋 Hello @xuk997, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.
If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.
If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.
For business inquiries or professional support requests please visit https://ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.
Python>=3.6.0 with all requirements.txt installed including PyTorch>=1.7. To get started:
$ git clone https://github.com/ultralytics/yolov5
$ cd yolov5
$ pip install -r requirements.txt
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), validation (val.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu every 24 hours and on every commit.
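I ran into two errors during training:
1. RuntimeError: cannot perform reduction function max on tensor with no elements because the operation does not have an identity. This error is not triggered on every run: I previously trained twice with the same parameters and dataset without any problem. The cause is that, in the non_max_suppression function in general.py, the line i, j = (x[:, 5:5+nc] > conf_thres).nonzero(as_tuple=False).T yields empty tensors for i and j. The full error output is: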
Starting training for 500 epochs...

Epoch gpu_mem box obj cls angle labels img_size
0/499 3.03G 0.08573 0.00518 0.01223 0.1266 18 640: 100%|█| 147/147
Epoch gpu_mem box obj cls angle labels img_size
1/499 3.03G 0.07211 0.004031 0.0007491 0.08157 18 640: 100%|███████████████████████████| 147/147 [01:23<00:00, 1.77it/s]
Class Images Labels P R mAP@.5 mAP@.5:.95: 95%|█████████████████████▊ | 18/19 [00:05<00:00, 3.25it/s]
Traceback (most recent call last):
  File "train.py", line 601, in <module>
    main(opt)
  File "train.py", line 499, in main
    train(opt.hyp, opt, device)
  File "train.py", line 353, in train
    results, maps, _ = val.run(data_dict,
  File "D:\Anaconda\envs\yolo-dota\lib\site-packages\torch\autograd\grad_mode.py", line 15, in decorate_context
    return func(*args, **kwargs)
  File "D:\Rotation-Detect-yolov5_poly-master\val.py", line 185, in run
    out = non_max_suppression(out, conf_thres, iou_thres, labels=lb, multi_label=True, agnostic=single_cls)
  File "D:\Rotation-Detect-yolov5_poly-master\utils\general.py", line 715, in non_max_suppression
    conf_angle, j_angle = x[i, 5+nc:].max(1, keepdim=True)
RuntimeError: cannot perform reduction function max on tensor with no elements because the operation does not have an identity
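A possible workaround for the first error, offered only as a sketch and not as the maintainer's fix, is to guard the reduction so it is skipped when no candidate passes the confidence threshold. The helper name `best_angle_per_candidate` and the row layout `[x, y, w, h, obj, class scores, angle scores]` are assumptions inferred from the traceback, not taken from the repository.

```python
import torch


def best_angle_per_candidate(x: torch.Tensor, i: torch.Tensor, nc: int):
    """Return (conf_angle, j_angle) for the rows of `x` selected by `i`.

    Hypothetical helper. Assumed row layout (inferred from the traceback):
    [x, y, w, h, obj, nc class scores, angle scores...].
    """
    angle_scores = x[i, 5 + nc:]
    if angle_scores.shape[0] == 0:
        # No candidate passed the threshold: return empty (0, 1) results
        # instead of calling max() on a tensor with no rows.
        empty_conf = x.new_zeros((0, 1))
        empty_idx = torch.zeros((0, 1), dtype=torch.long, device=x.device)
        return empty_conf, empty_idx
    conf_angle, j_angle = angle_scores.max(1, keepdim=True)
    return conf_angle, j_angle


if __name__ == "__main__":
    nc = 2  # two classes, followed by three angle bins in this toy example
    x = torch.tensor([
        [0.0, 0.0, 1.0, 1.0, 0.9, 0.80, 0.10, 0.2, 0.7, 0.1],
        [0.0, 0.0, 1.0, 1.0, 0.9, 0.10, 0.05, 0.6, 0.3, 0.1],
    ])
    # Same indexing expression as quoted in the issue.
    i, j = (x[:, 5:5 + nc] > 0.99).nonzero(as_tuple=False).T
    conf_angle, j_angle = best_angle_per_candidate(x, i, nc)
    print(conf_angle.shape, j_angle.shape)
```

Whatever code consumes `conf_angle` and `j_angle` afterwards would still need to tolerate zero rows, so an early `continue` for that image inside the NMS loop may be the simpler option.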
2. RuntimeError: The expanded size of the tensor (4) must match the existing size (2) at non-singleton dimension 0. Target sizes: [4, 6]. Tensor sizes: [2, 6]. I previously trained with your hyp.finetune_objects365.yaml configuration file. After I adjusted the hyperparameters, training hits this error every time. The cause appears to be:
θ calculation anomaly; current data: 296.6229553222656250, 614.3365478515625000, 0.0000000000000000, 29.1324348449707031, 180.0; outside the range of the OpenCV representation: [-90, 0)
θ calculation anomaly; current data: 432.7497558593750000, 615.0087890625000000, 0.0000000000000000, 20.7464981079101562, 180.0; outside the range of the OpenCV representation: [-90, 0)
The hyperparameter configuration is: lr0=0.001, lrf=0.17, momentum=0.779, weight_decay=0.00036, warmup_epochs=2, warmup_momentum=0.5, warmup_bias_lr=0.05, box=0.0296, cls=0.243, cls_pw=0.631, obj=0.301, obj_pw=0.911, angle=0.266, angle_pw=0.333, iou_t=0.2, anchor_t=3.44, anchors=3.2, fl_gamma=0.0, hsv_h=0.0188, hsv_s=0.704, hsv_v=0.36, degrees=0.0, translate=0.245, scale=0.898, shear=0.602, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.243, copy_paste=0.0
The opencv-python version is 4.1.2.30.
Starting training for 500 epochs...

Epoch gpu_mem box obj cls angle labels img_size
0/499 3.05G 0.04776 0.004455 0.0354 0.2496 25 640: 20%|█████▌ | 29/147 [00:25<01:43, 1.14it/s]
Traceback (most recent call last):
  File "train.py", line 601, in <module>
    main(opt)
  File "train.py", line 499, in main
    train(opt.hyp, opt, device)
  File "train.py", line 290, in train
    for i, (imgs, targets, paths, _) in pbar: # batch -------------------------------------------------------------
  File "D:\Anaconda\envs\yolo-dota\lib\site-packages\tqdm\std.py", line 1180, in __iter__
    for obj in iterable:
  File "D:\Rotation-Detect-yolov5_poly-master\utils\datasets.py", line 314, in __iter__
    yield next(self.iterator)
  File "D:\Anaconda\envs\yolo-dota\lib\site-packages\torch\utils\data\dataloader.py", line 363, in __next__
    data = self._next_data()
  File "D:\Anaconda\envs\yolo-dota\lib\site-packages\torch\utils\data\dataloader.py", line 989, in _next_data
    return self._process_data(data)
  File "D:\Anaconda\envs\yolo-dota\lib\site-packages\torch\utils\data\dataloader.py", line 1014, in _process_data
    data.reraise()
  File "D:\Anaconda\envs\yolo-dota\lib\site-packages\torch\_utils.py", line 395, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "D:\Anaconda\envs\yolo-dota\lib\site-packages\torch\utils\data\_utils\worker.py", line 185, in _worker_loop
    data = fetcher.fetch(index)
  File "D:\Anaconda\envs\yolo-dota\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "D:\Anaconda\envs\yolo-dota\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "D:\Rotation-Detect-yolov5_poly-master\utils\datasets.py", line 820, in __getitem__
    labels_out[:, 1:] = torch.from_numpy(labels_new[:, 0:6])
RuntimeError: The expanded size of the tensor (4) must match the existing size (2) at non-singleton dimension 0. Target sizes: [4, 6]. Tensor sizes: [2, 6]
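One possible reading of the second error, not confirmed anywhere in this thread: both θ warnings report a box with width 0.0, two such boxes are reported, and the target tensor has 4 rows while the source has 2. That is consistent with `labels_out` being sized before degenerate boxes were dropped. The sketch below shows the idea of filtering first and allocating afterwards. The function names, the `min_side` threshold, and the column layout `[cls, x_c, y_c, w, h, theta]` are hypothetical and not taken from the repository's `datasets.py`.

```python
import numpy as np


def drop_degenerate_boxes(labels: np.ndarray, min_side: float = 1e-3) -> np.ndarray:
    """Keep only rows whose width and height both exceed `min_side`.

    Assumed (hypothetical) column layout: [cls, x_c, y_c, w, h, theta].
    """
    keep = (labels[:, 3] > min_side) & (labels[:, 4] > min_side)
    return labels[keep]


def build_labels_out(labels_new: np.ndarray) -> np.ndarray:
    """Allocate the output from the labels that survived filtering.

    Column 0 is left at zero, on the assumption that it is filled in later
    (for example with the image index when the batch is collated).
    """
    n = len(labels_new)
    labels_out = np.zeros((n, 7), dtype=np.float32)
    if n:
        labels_out[:, 1:] = labels_new[:, 0:6]
    return labels_out


if __name__ == "__main__":
    # Two zero-width boxes (values taken from the warnings above) and two valid ones.
    labels = np.array([
        [0, 296.62, 614.34, 0.0, 29.13, 180.0],
        [1, 100.00, 100.00, 50.0, 20.00, -45.0],
        [0, 432.75, 615.01, 0.0, 20.75, 180.0],
        [2, 200.00, 200.00, 30.0, 10.00, -10.0],
    ])
    kept = drop_degenerate_boxes(labels)
    print(build_labels_out(kept).shape)
```

Because the row count is taken from the filtered array, the assignment can no longer mix a pre-filter size with post-filter data. In the actual code the result would then be converted with `torch.from_numpy`.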
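As a beginner I can only spot these problems, not fix them yet. I would appreciate any guidance.
The project does currently have some problems, and I am debugging them. In the meantime, you could try the other two rotated object detection projects mentioned in the README.
OK, thanks~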
👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!
Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!