Closed thusinh1969 closed 3 years ago
👋 Hello @thusinh1969, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.
If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we cannot help you.
If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.
For business inquiries or professional support requests please visit https://ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.
Python>=3.6.0 with all requirements.txt installed including PyTorch>=1.7. To get started:
$ git clone https://github.com/ultralytics/yolov5
$ cd yolov5
$ pip install -r requirements.txt
YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), validation (val.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu every 24 hours and on every commit.
@thusinh1969 for a large dataset it may take a few minutes to initialize the dataloaders. Does COCO training work for you?
python train.py --data coco.yaml
Found the bug: I had to upgrade wandb :( !!! Works now.
Thanks, Steve
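For anyone hitting the same hang, a quick way to see whether the installed wandb is behind the release that wandb itself advertises in the log (0.10.28 installed, 0.11.0 available) is a small version check. This is only a sketch: `parse_version`, `is_outdated` and `installed_version` are hypothetical helper names, `importlib.metadata` needs Python 3.8+, and the thread does not say which wandb release actually contains the fix.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '0.10.28' into (0, 10, 28); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_outdated(installed, available):
    """True if the installed version is strictly older than the available one."""
    return parse_version(installed) < parse_version(available)


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    current = installed_version("wandb")
    if current is None:
        print("wandb is not installed")
    elif is_outdated(current, "0.11.0"):  # 0.11.0 is the version named in the log
        print(f"wandb {current} is older than 0.11.0, try: pip install wandb --upgrade")
    else:
        print(f"wandb {current} is at least 0.11.0")
```

With the versions from the log, `is_outdated("0.10.28", "0.11.0")` is true, which matches the upgrade notice wandb printed.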
I used a scaled-down subset of Open Images V6, with about 250,000 images.
**YAML file:**
Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../ # dataset root dir
train: train # train images (relative to 'path') 128 images
val: test # val images (relative to 'path') 128 images
Classes
nc: 66 # number of classes
names: ['Ladder', 'Sink', 'Home appliance', 'Tent', 'Lantern', 'Stairs', 'Chair', 'Cabinetry', 'Bidet', 'Desk', 'Bronze sculpture', 'Fountain', 'Christmas tree', 'Studio couch', 'Wine rack', 'Couch', 'Door', 'Shower', 'Wardrobe', 'Tree house', 'Nightstand', 'Window blind', 'Bathtub', 'Houseplant', 'House', 'Ceiling fan', 'Sofa bed', 'Heater', 'Curtain', 'Bed', 'Fireplace', 'Bookcase', 'Refrigerator', 'Wood-burning stove', 'Filing cabinet', 'Table', 'Tableware', 'Porch', 'Billiard table', 'Bathroom cabinet', 'Mirror', 'Chest of drawers', 'Infant bed', 'Cupboard', 'Jacuzzi', 'Sculpture', 'Picture frame', 'Loveseat', 'Coffee table', 'Toilet', 'Countertop', 'Waste container', 'Swimming pool', 'Furniture', 'Bench', 'Window', 'Closet', 'Lamp', 'Flowerpot', 'Drawer', 'Stool', 'Shelf', 'Spice rack', 'Kitchen & dining room table', 'Dog bed', 'Cat furniture'] # class names
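A dataset file like this can be sanity-checked before training, for example that `nc` matches the length of `names`. The following is a minimal sketch, not part of YOLOv5: `check_dataset_config` is a hypothetical helper that works on an already-loaded dict (for instance the result of `yaml.safe_load`), and the three-class config in the usage section is made up for illustration.

```python
def check_dataset_config(cfg):
    """Return a list of problems found in a YOLOv5-style dataset dict."""
    problems = []
    names = cfg.get("names", [])
    nc = cfg.get("nc")

    # The class count must agree with the number of class names.
    if nc != len(names):
        problems.append(f"nc={nc} but names has {len(names)} entries")

    # Repeated class names usually point at a copy/paste mistake.
    duplicates = sorted({n for n in names if names.count(n) > 1})
    if duplicates:
        problems.append(f"duplicate class names: {duplicates}")

    # Training needs at least a train and a val entry.
    for key in ("train", "val"):
        if key not in cfg:
            problems.append(f"missing '{key}' entry")
    return problems


if __name__ == "__main__":
    # Illustrative config only, not the 66-class file from this issue.
    example = {
        "path": "../",
        "train": "train",
        "val": "test",
        "nc": 3,
        "names": ["Ladder", "Sink", "Tent"],
    }
    for problem in check_dataset_config(example) or ["no problems found"]:
        print(problem)
```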
lr0: 0.01 # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.2 # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937 # SGD momentum/Adam beta1
weight_decay: 0.0005 # optimizer weight decay 5e-4
warmup_epochs: 3.0 # warmup epochs (fractions ok)
warmup_momentum: 0.8 # warmup initial momentum
warmup_bias_lr: 0.1 # warmup initial bias lr
box: 0.05 # box loss gain
cls: 0.5 # cls loss gain
cls_pw: 1.0 # cls BCELoss positive_weight
obj: 1.0 # obj loss gain (scale with pixels)
obj_pw: 1.0 # obj BCELoss positive_weight
iou_t: 0.20 # IoU training threshold
anchor_t: 4.0 # anchor-multiple threshold
anchors: 3 # anchors per output layer (0 to ignore)
fl_gamma: 0.0 # focal loss gamma (efficientDet default gamma=1.5)
hsv_h: 0.015 # image HSV-Hue augmentation (fraction)
hsv_s: 0.7 # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4 # image HSV-Value augmentation (fraction)
degrees: 0.0 # image rotation (+/- deg)
translate: 0.1 # image translation (+/- fraction)
scale: 0.5 # image scale (+/- gain)
shear: 0.0 # image shear (+/- deg)
perspective: 0.0 # image perspective (+/- fraction), range 0-0.001
flipud: 0.0 # image flip up-down (probability)
fliplr: 0.5 # image flip left-right (probability)
mosaic: 1.0 # image mosaic (probability)
mixup: 0.0 # image mixup (probability)
copy_paste: 0.0 # segment copy-paste (probability)
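To make the `lr0` / `lrf` comments concrete: with `lr0: 0.01` and `lrf: 0.2` the final learning rate is `lr0 * lrf = 0.002`. The sketch below assumes a cosine decay from `lr0` down to that value over the 300 epochs shown in the log. The cosine shape and the name `cosine_lr` are assumptions made for illustration, and warmup is ignored, so check `train.py` for the schedule it really uses.

```python
import math


def cosine_lr(epoch, epochs, lr0, lrf):
    """Learning rate at `epoch`, decaying from lr0 to lr0 * lrf along a half cosine."""
    # factor goes smoothly from 1.0 (epoch 0) to lrf (epoch == epochs)
    factor = ((1 - math.cos(epoch * math.pi / epochs)) / 2) * (lrf - 1.0) + 1.0
    return lr0 * factor


if __name__ == "__main__":
    # Values taken from the hyperparameters and epochs=300 in this issue.
    for epoch in (0, 75, 150, 225, 300):
        print(epoch, round(cosine_lr(epoch, 300, lr0=0.01, lrf=0.2), 6))
```

Under this assumption the rate starts at 0.01, passes 0.006 at the midpoint, and ends at 0.002.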
## 🐛 Errors:
Training starts, scans the dataset, and finds some corrupt images/labels. Then it hangs right there. The GPU uses 2.5 GB, memory usage is not growing, and there is no activity in wandb.
(Steve38_WIN) nguyen@hatto2:~/OpenImage/YoloV5/yolov5$ python train.py --batch 8 --img-size 640 --data ../steve_openimage.yaml --weights ../pretrained/yolov5x.pt --device 0
train: weights=../pretrained/yolov5x.pt, cfg=, data=../steve_openimage.yaml, hyp=data/hyps/hyp.scratch.yaml, epochs=300, batch_size=8, imgsz=640, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, evolve=None, bucket=, cache_images=False, image_weights=False, device=0, multi_scale=False, single_cls=False, adam=False, sync_bn=False, workers=8, project=runs/train, entity=None, name=exp, exist_ok=False, quad=False, linear_lr=False, label_smoothing=0.0, upload_dataset=False, bbox_interval=-1, save_period=-1, artifact_alias=latest, local_rank=-1
github: ⚠️ WARNING: code is out of date by 1 commit. Use 'git pull' to update or 'git clone https://github.com/ultralytics/yolov5' to download latest.
YOLOv5 🚀 v5.0-303-g3bef77f torch 1.7.1 CUDA:0 (NVIDIA GeForce RTX 2080 Ti, 11019.0625MB)
hyperparameters: lr0=0.01, lrf=0.2, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0, copy_paste=0.0
tensorboard: Start with 'tensorboard --logdir runs/train', view at http://localhost:6006/
2021-07-21 23:13:27.035115: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
wandb: Currently logged in as: hatto (use `wandb login --relogin` to force relogin)
wandb: wandb version 0.11.0 is available! To upgrade, please run:
wandb: $ pip install wandb --upgrade
2021-07-21 23:13:31.765893: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
wandb: Tracking run with wandb version 0.10.28
wandb: Syncing run exp6
wandb: ⭐⭐️ View project at https://wandb.ai/hatto/YOLOv5
wandb: 🚀 View run at https://wandb.ai/hatto/YOLOv5/runs/26sf48h2
wandb: Run data is saved locally in /home/nguyen/OpenImage/YoloV5/yolov5/wandb/run-20210721_231329-26sf48h2
wandb: Run `wandb offline` to turn off syncing.
Overriding model.yaml nc=80 with nc=66
0 -1 1 8800 models.common.Focus [3, 80, 3]
1 -1 1 115520 models.common.Conv [80, 160, 3, 2]
2 -1 1 309120 models.common.C3 [160, 160, 4]
3 -1 1 461440 models.common.Conv [160, 320, 3, 2]
4 -1 1 3285760 models.common.C3 [320, 320, 12]
5 -1 1 1844480 models.common.Conv [320, 640, 3, 2]
6 -1 1 13125120 models.common.C3 [640, 640, 12]
7 -1 1 7375360 models.common.Conv [640, 1280, 3, 2]
8 -1 1 4099840 models.common.SPP [1280, 1280, [5, 9, 13]]
9 -1 1 19676160 models.common.C3 [1280, 1280, 4, False]
10 -1 1 820480 models.common.Conv [1280, 640, 1, 1]
11 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
12 [-1, 6] 1 0 models.common.Concat [1]
13 -1 1 5332480 models.common.C3 [1280, 640, 4, False]
14 -1 1 205440 models.common.Conv [640, 320, 1, 1]
15 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
16 [-1, 4] 1 0 models.common.Concat [1]
17 -1 1 1335040 models.common.C3 [640, 320, 4, False]
18 -1 1 922240 models.common.Conv [320, 320, 3, 2]
19 [-1, 14] 1 0 models.common.Concat [1]
20 -1 1 4922880 models.common.C3 [640, 640, 4, False]
21 -1 1 3687680 models.common.Conv [640, 640, 3, 2]
22 [-1, 10] 1 0 models.common.Concat [1]
23 -1 1 19676160 models.common.C3 [1280, 1280, 4, False]
24 [17, 20, 23] 1 477759 models.yolo.Detect [66, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [320, 640, 1280]]
Model Summary: 607 layers, 87681759 parameters, 87681759 gradients, 218.7 GFLOPs
Transferred 788/794 items from ../pretrained/yolov5x.pt
Scaled weight_decay = 0.0005
Optimizer groups: 134 .bias, 134 conv.weight, 131 other
albumentations: Blur(always_apply=False, p=0.1, blur_limit=(3, 7)), MedianBlur(always_apply=False, p=0.1, blur_limit=(3, 7)), ToGray(always_apply=False, p=0.01)
train: Scanning '../train/labels' images and labels...254739 found, 12 missing, 7791 empty, 375 corrupted: 100%|███████████████████████████████████████████| 254751/254751 [00:34<00:00, 7311.04it/s]
train: WARNING: Ignoring corrupted image and/or label ../train/images/0071689b11f8a240.jpg: duplicate labels
train: WARNING: Ignoring corrupted image and/or label ../train/images/0071689b11f8a240.jpg: duplicate labels
train: WARNING: Ignoring corrupted image and/or label ../train/images/00b4a9339181a90b.jpg: duplicate labels
It hung right there!
Any help is appreciated. Steve
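The "duplicate labels" warnings in the log mean that some label files contain the same row more than once. Those images are reported as ignored, so they are probably not what caused the hang, but the files can be cleaned up separately. Below is a minimal sketch that assumes the usual one-object-per-line text format (`class x_center y_center width height`). `duplicate_rows` and `deduplicate` are hypothetical helpers, and rows are compared as text, so `0.5` and `0.50` count as different.

```python
from collections import Counter


def _rows(label_text):
    """Split label text into tuples of fields, skipping blank lines."""
    return [tuple(line.split()) for line in label_text.splitlines() if line.strip()]


def duplicate_rows(label_text):
    """Return each row that appears more than once, in first-seen order."""
    counts = Counter(_rows(label_text))
    return [" ".join(row) for row, n in counts.items() if n > 1]


def deduplicate(label_text):
    """Return the label text with repeated rows dropped, keeping first occurrences."""
    seen = set()
    kept = []
    for row in _rows(label_text):
        if row not in seen:
            seen.add(row)
            kept.append(" ".join(row))
    return "\n".join(kept) + ("\n" if kept else "")


if __name__ == "__main__":
    # Made-up label content for illustration only.
    sample = "0 0.5 0.5 0.2 0.2\n1 0.1 0.1 0.05 0.05\n0 0.5 0.5 0.2 0.2\n"
    print(duplicate_rows(sample))
    print(deduplicate(sample), end="")
```

To apply it to a real file, read the text with `pathlib.Path(...).read_text()` and write the result of `deduplicate` back only after keeping a backup of the original labels.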