ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0
51.23k stars 16.44k forks

Training on Custom Data #9311

Closed karl-gardner closed 2 years ago

karl-gardner commented 2 years ago

Search before asking

YOLOv5 Component

No response

Bug

Hello,

I copied the entire contents of the yolov5 v6.1 repository into my own repository on GitHub on 4/28/2022:

https://github.com/karl-gardner/droplet_detection/tree/master/yolov5

At that time, custom training was working. However, when I now try to train on my custom dataset, I receive an error:

/content/droplet_detection/yolov5
train: weights=, cfg=./models/yolov5m.yaml, data=../yaml_files/droplet_model.yaml, hyp=data/hyps/hyp.scratch-low.yaml, epochs=5, batch_size=32, imgsz=544, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, noplots=False, evolve=None, bucket=, cache=ram, image_weights=False, device=, multi_scale=False, single_cls=False, optimizer=SGD, sync_bn=False, workers=8, project=runs/train, name=exp, exist_ok=False, quad=False, cos_lr=False, label_smoothing=0.0, patience=100, freeze=[0], save_period=-1, local_rank=-1, entity=None, upload_dataset=False, bbox_interval=-1, artifact_alias=latest
github: skipping check (not a git repository), for updates see https://github.com/ultralytics/yolov5
YOLOv5 🚀 2022-9-6 torch 1.12.1+cu113 CUDA:0 (Tesla P100-PCIE-16GB, 16281MiB)

hyperparameters: lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0, copy_paste=0.0
Weights & Biases: run 'pip install wandb' to automatically track and visualize YOLOv5 🚀 runs (RECOMMENDED)
TensorBoard: Start with 'tensorboard --logdir runs/train', view at http://localhost:6006/
Downloading https://ultralytics.com/assets/Arial.ttf to /root/.config/Ultralytics/Arial.ttf...
100% 755k/755k [00:00<00:00, 28.5MB/s]
Overriding model.yaml nc=80 with nc=4

                 from  n    params  module                                  arguments
  0                -1  1      5280  models.common.Conv                      [3, 48, 6, 2, 2]
  1                -1  1     41664  models.common.Conv                      [48, 96, 3, 2]
  2                -1  2     65280  models.common.C3                        [96, 96, 2]
  3                -1  1    166272  models.common.Conv                      [96, 192, 3, 2]
  4                -1  4    444672  models.common.C3                        [192, 192, 4]
  5                -1  1    664320  models.common.Conv                      [192, 384, 3, 2]
  6                -1  6   2512896  models.common.C3                        [384, 384, 6]
  7                -1  1   2655744  models.common.Conv                      [384, 768, 3, 2]
  8                -1  2   4134912  models.common.C3                        [768, 768, 2]
  9                -1  1   1476864  models.common.SPPF                      [768, 768, 5]
 10                -1  1    295680  models.common.Conv                      [768, 384, 1, 1]
 11                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 12           [-1, 6]  1         0  models.common.Concat                    [1]
 13                -1  2   1182720  models.common.C3                        [768, 384, 2, False]
 14                -1  1     74112  models.common.Conv                      [384, 192, 1, 1]
 15                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 16           [-1, 4]  1         0  models.common.Concat                    [1]
 17                -1  2    296448  models.common.C3                        [384, 192, 2, False]
 18                -1  1    332160  models.common.Conv                      [192, 192, 3, 2]
 19          [-1, 14]  1         0  models.common.Concat                    [1]
 20                -1  2   1035264  models.common.C3                        [384, 384, 2, False]
 21                -1  1   1327872  models.common.Conv                      [384, 384, 3, 2]
 22          [-1, 10]  1         0  models.common.Concat                    [1]
 23                -1  2   4134912  models.common.C3                        [768, 768, 2, False]
 24      [17, 20, 23]  1     36369  models.yolo.Detect                      [4, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [192, 384, 768]]

YOLOv5m summary: 369 layers, 20883441 parameters, 20883441 gradients, 48.3 GFLOPs

Scaled weight_decay = 0.0005
optimizer: SGD with parameter groups 79 weight (no decay), 82 weight, 82 bias
albumentations: Blur(always_apply=False, p=0.01, blur_limit=(3, 7)), MedianBlur(always_apply=False, p=0.01, blur_limit=(3, 7)), ToGray(always_apply=False, p=0.01), CLAHE(always_apply=False, p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8))
train: Scanning '/content/droplet_detection/yolov5/../train/labels' images and labels...412 found, 0 missing, 0 empty, 0 corrupt: 100% 412/412 [00:00<00:00, 1303.58it/s]
train: New cache created: /content/droplet_detection/yolov5/../train/labels.cache
train: Caching images (0.4GB ram): 100% 412/412 [00:01<00:00, 262.75it/s]
val: Scanning '/content/droplet_detection/yolov5/../valid/labels' images and labels...103 found, 0 missing, 0 empty, 0 corrupt: 100% 103/103 [00:00<00:00, 571.20it/s]
val: WARNING: /content/droplet_detection/yolov5/../valid/images/test_016241_png.rf.661d45690d6a99569e6ae72cd95aced7.jpg: 1 duplicate labels removed
val: New cache created: /content/droplet_detection/yolov5/../valid/labels.cache
val: Caching images (0.1GB ram): 100% 103/103 [00:01<00:00, 100.24it/s]
Plotting labels to runs/train/exp/labels.jpg...

AutoAnchor: 5.92 anchors/target, 1.000 Best Possible Recall (BPR). Current anchors are a good fit to dataset ✅
Image sizes 544 train, 544 val
Using 2 dataloader workers
Logging results to runs/train/exp
Starting training for 5 epochs...

 Epoch   gpu_mem       box       obj       cls    labels  img_size

0% 0/13 [00:06<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 668, in <module>
    main(opt)
  File "train.py", line 563, in main
    train(opt.hyp, opt, device, callbacks)
  File "train.py", line 350, in train
    loss, loss_items = compute_loss(pred, targets.to(device))  # loss scaled by batch_size
  File "/content/droplet_detection/yolov5/utils/loss.py", line 125, in __call__
    tcls, tbox, indices, anchors = self.build_targets(p, targets)  # targets
  File "/content/droplet_detection/yolov5/utils/loss.py", line 229, in build_targets
    indices.append((b, a, gj.clamp_(0, gain[3] - 1), gi.clamp_(0, gain[2] - 1)))  # image, anchor, grid indices
RuntimeError: result type Float can't be cast to the desired output type long int
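For readers hitting the same message, the traceback points at an in-place `clamp_` on integer grid indices whose upper bound (`gain[2] - 1`) is a float tensor. The sketch below tries to isolate that pattern outside of YOLOv5. It assumes `gi` is a long tensor and `gain` a float tensor, as the error text suggests. The values and the helper names `clamp_with_float_bound` and `clamp_with_int_bound` are hypothetical. Whether the first call fails may depend on the installed PyTorch version (the log above shows torch 1.12.1). The integer-bound variant is one possible local workaround and is not necessarily the change made upstream.

```python
import torch

# Assumed stand-ins for the tensors in build_targets (values are made up).
gi = torch.tensor([3, 70, -2], dtype=torch.long)  # integer grid x-indices
gain = torch.ones(7)                              # float tensor
gain[2] = 68.0                                    # pretend grid width


def clamp_with_float_bound(idx, gain):
    """Mirror the failing call: in-place clamp of a long tensor with a float bound.

    Returns the error message if PyTorch refuses the call, else None.
    """
    try:
        idx.clone().clamp_(0, gain[2] - 1)
        return None
    except (RuntimeError, TypeError) as err:
        return str(err)


def clamp_with_int_bound(idx, gain):
    """Possible workaround: pass plain Python ints so the result stays long."""
    upper = int(gain[2]) - 1
    return idx.clone().clamp_(0, upper)


print(clamp_with_float_bound(gi, gain))         # version dependent
print(clamp_with_int_bound(gi, gain).tolist())  # [3, 67, 0]
```

Both helpers work on a clone, so `gi` itself is left unchanged. In a copied repository, the equivalent local edit would be to the `clamp_` bounds in `utils/loss.py`; comparing against the current upstream file is the safer route.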

Environment

No response

Minimal Reproducible Example

No response

Additional

No response

Are you willing to submit a PR?

glenn-jocher commented 2 years ago

@karl-gardner 👋 hi, thanks for letting us know about this possible problem with YOLOv5 🚀. We've created a few short guidelines below to help users provide what we need in order to start investigating a possible problem.

How to create a Minimal, Reproducible Example

When asking a question, people will be better able to provide help if you provide code that they can easily understand and use to reproduce the problem. This is referred to by community members as creating a minimum reproducible example. Your code that reproduces the problem should be:

For Ultralytics to provide assistance your code should also be:

If you believe your problem meets all the above criteria, please close this issue and raise a new one using the 🐛 Bug Report template with a minimum reproducible example to help us better understand and diagnose your problem.

Thank you! 😃

github-actions[bot] commented 2 years ago

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.

Access additional YOLOv5 🚀 resources:

Access additional Ultralytics ⚡ resources:

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!