googlecolab / colabtools

Python libraries for Google Colaboratory
Apache License 2.0

Google Drive Disconnect During ML Training #3950

Closed emersondelemmus closed 1 year ago

emersondelemmus commented 1 year ago

I am training a YOLOv8 model with instance segmentation using the following command:

!yolo task=segment mode=train batch=-1 model=yolov8l-seg.pt data=data.yaml epochs=150 imgsz=2176 save=true

The following is the training output (the error is at the end).

Ultralytics YOLOv8.0.28 🚀 Python-3.10.12 torch-2.0.1+cu118 CUDA:0 (NVIDIA A100-SXM4-40GB, 40514MiB)
yolo/engine/trainer: task=segment, mode=train, model=yolov8l-seg.pt, data=data.yaml, epochs=150, patience=50, batch=-1, imgsz=2176, save=True, cache=False, device=None, workers=8, project=None, name=None, exist_ok=False, pretrained=False, optimizer=SGD, verbose=True, seed=0, deterministic=True, single_cls=False, image_weights=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, show=False, save_txt=False, save_conf=False, save_crop=False, hide_labels=False, hide_conf=False, vid_stride=1, line_thickness=3, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, boxes=True, format=torchscript, keras=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=None, workspace=4, nms=False, lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=7.5, cls=0.5, dfl=1.5, fl_gamma=0.0, label_smoothing=0.0, nbs=64, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0, copy_paste=0.0, cfg=None, v5loader=False, save_dir=runs/segment/train9
Downloading https://ultralytics.com/assets/Arial.ttf to /root/.config/Ultralytics/Arial.ttf...
100% 755k/755k [00:00<00:00, 138MB/s]
2023-08-08 00:22:40.271747: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Overriding model.yaml nc=80 with nc=3

                   from  n    params  module                                       arguments
  0                  -1  1      1856  ultralytics.nn.modules.Conv                  [3, 64, 3, 2]
  1                  -1  1     73984  ultralytics.nn.modules.Conv                  [64, 128, 3, 2]
  2                  -1  3    279808  ultralytics.nn.modules.C2f                   [128, 128, 3, True]
  3                  -1  1    295424  ultralytics.nn.modules.Conv                  [128, 256, 3, 2]
  4                  -1  6   2101248  ultralytics.nn.modules.C2f                   [256, 256, 6, True]
  5                  -1  1   1180672  ultralytics.nn.modules.Conv                  [256, 512, 3, 2]
  6                  -1  6   8396800  ultralytics.nn.modules.C2f                   [512, 512, 6, True]
  7                  -1  1   2360320  ultralytics.nn.modules.Conv                  [512, 512, 3, 2]
  8                  -1  3   4461568  ultralytics.nn.modules.C2f                   [512, 512, 3, True]
  9                  -1  1    656896  ultralytics.nn.modules.SPPF                  [512, 512, 5]
 10                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']
 11             [-1, 6]  1         0  ultralytics.nn.modules.Concat                [1]
 12                  -1  3   4723712  ultralytics.nn.modules.C2f                   [1024, 512, 3]
 13                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']
 14             [-1, 4]  1         0  ultralytics.nn.modules.Concat                [1]
 15                  -1  3   1247744  ultralytics.nn.modules.C2f                   [768, 256, 3]
 16                  -1  1    590336  ultralytics.nn.modules.Conv                  [256, 256, 3, 2]
 17            [-1, 12]  1         0  ultralytics.nn.modules.Concat                [1]
 18                  -1  3   4592640  ultralytics.nn.modules.C2f                   [768, 512, 3]
 19                  -1  1   2360320  ultralytics.nn.modules.Conv                  [512, 512, 3, 2]
 20             [-1, 9]  1         0  ultralytics.nn.modules.Concat                [1]
 21                  -1  3   4723712  ultralytics.nn.modules.C2f                   [1024, 512, 3]
 22        [15, 18, 21]  1   7891321  ultralytics.nn.modules.Segment               [3, 32, 256, [256, 512, 512]]
YOLOv8l-seg summary: 401 layers, 45938361 parameters, 45938345 gradients, 220.8 GFLOPs

Transferred 651/657 items from pretrained weights
AutoBatch: Computing optimal batch size for imgsz=2176
AutoBatch: CUDA:0 (NVIDIA A100-SXM4-40GB) 39.56G total, 0.35G reserved, 0.35G allocated, 38.86G free
Params GFLOPs GPU_mem (GB) forward (ms) backward (ms) input output
45938361 2552 14.506 69.34 nan (1, 3, 2176, 2176) list
45938361 5105 28.242 76.7 nan (2, 3, 2176, 2176) list
CUDA out of memory. Tried to allocate 578.00 MiB (GPU 0; 39.56 GiB total capacity; 35.69 GiB already allocated; 94.56 MiB free; 36.41 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
CUDA out of memory. Tried to allocate 1.13 GiB (GPU 0; 39.56 GiB total capacity; 35.16 GiB already allocated; 136.56 MiB free; 36.37 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
CUDA out of memory. Tried to allocate 578.00 MiB (GPU 0; 39.56 GiB total capacity; 34.86 GiB already allocated; 174.56 MiB free; 36.34 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
AutoBatch: Using batch-size 1 for CUDA:0 15.21G/39.56G (38%) ✅
optimizer: SGD(lr=0.01) with parameter groups 106 weight(decay=0.0), 117 weight(decay=0.0005), 116 bias
train: Scanning /content/drive/.shortcut-targets-by-id/1of8frlV3H1_GB4M8xYLa96K0MYP-Xjx5/Fire Behavior/Data/YOLOv6/7_20_2023/labels/train.cache... 376 images, 0 backgrounds, 0 corrupt: 100% 376/376 [00:00<?, ?it/s]
albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8))
val: Scanning /content/drive/.shortcut-targets-by-id/1of8frlV3H1_GB4M8xYLa96K0MYP-Xjx5/Fire Behavior/Data/YOLOv6/7_20_2023/labels/valid.cache... 46 images, 0 backgrounds, 0 corrupt: 100% 46/46 [00:00<?, ?it/s]
Image sizes 2176 train, 2176 val
Using 0 dataloader workers
Logging results to runs/segment/train9
Starting training for 150 epochs...

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
  1/150      17.3G      2.414      4.474      7.251      2.528         16       2176: 100% 376/376 [04:54<00:00,  1.28it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:24<00:00,  1.07s/it]
               all         46        133      0.215     0.0557     0.0499      0.017      0.234     0.0557     0.0448     0.0142

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
  2/150      19.6G      2.217      3.831       4.12      2.212         12       2176: 100% 376/376 [04:30<00:00,  1.39it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.34it/s]
               all         46        133      0.462     0.0799     0.0701     0.0268      0.405     0.0753     0.0767     0.0241

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
  3/150      19.6G      2.182      3.791      3.521      2.182         10       2176: 100% 376/376 [04:31<00:00,  1.39it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.34it/s]
               all         46        133      0.206      0.116     0.0984     0.0331       0.12      0.188     0.0906     0.0351

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
  4/150      19.6G      2.131      3.669      3.465      2.153          8       2176: 100% 376/376 [04:31<00:00,  1.38it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.35it/s]
               all         46        133     0.0373      0.255     0.0345    0.00939     0.0367       0.17     0.0403     0.0118

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
  5/150      19.6G       2.09      3.639      3.421      2.089          8       2176: 100% 376/376 [04:32<00:00,  1.38it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.33it/s]
               all         46        133     0.0587      0.109     0.0481     0.0203      0.126     0.0976      0.058     0.0222

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
  6/150      19.6G      2.029      3.685      3.123      2.105          3       2176: 100% 376/376 [04:33<00:00,  1.37it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.34it/s]
               all         46        133     0.0948     0.0615     0.0458     0.0172        0.1     0.0522     0.0365     0.0154

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
  7/150      19.6G      2.074      3.649      3.228      2.131          7       2176: 100% 376/376 [04:31<00:00,  1.38it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.33it/s]
               all         46        133     0.0847      0.164     0.0771     0.0262     0.0959      0.143     0.0761     0.0224

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
  8/150      19.6G      2.005      3.526      3.039      2.097          7       2176: 100% 376/376 [04:33<00:00,  1.37it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.31it/s]
               all         46        133       0.15      0.163      0.088     0.0375      0.214      0.165     0.0895     0.0334

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
  9/150      19.6G      2.047      3.643      3.146      2.122          8       2176: 100% 376/376 [04:32<00:00,  1.38it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.30it/s]
               all         46        133      0.157      0.108     0.0607     0.0221      0.196     0.0506     0.0666     0.0186

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 10/150      19.6G      1.978      3.549      3.142      2.055          5       2176: 100% 376/376 [04:32<00:00,  1.38it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.35it/s]
               all         46        133      0.142      0.113     0.0573     0.0201      0.168      0.103     0.0597     0.0167

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 11/150      19.6G      1.965      3.434      2.985      2.055          4       2176: 100% 376/376 [04:32<00:00,  1.38it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.32it/s]
               all         46        133      0.198      0.216      0.105     0.0379      0.189      0.208     0.0975     0.0336

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 12/150      19.6G          2      3.519      2.929      2.039          7       2176: 100% 376/376 [04:30<00:00,  1.39it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.34it/s]
               all         46        133      0.153       0.21      0.106     0.0471       0.18       0.22     0.0977     0.0313

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 13/150      19.6G      1.963      3.482      2.949       2.07          2       2176: 100% 376/376 [04:28<00:00,  1.40it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.46it/s]
               all         46        133      0.116      0.174     0.0782     0.0249     0.0968      0.149      0.062     0.0187

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 14/150      19.6G      1.962      3.407      2.896      2.066          7       2176: 100% 376/376 [04:33<00:00,  1.38it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.29it/s]
               all         46        133      0.108      0.173     0.0693     0.0283      0.155       0.13     0.0784     0.0252

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 15/150      19.6G      1.955      3.341       2.83      2.014          6       2176: 100% 376/376 [04:32<00:00,  1.38it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.32it/s]
               all         46        133       0.16      0.174      0.116     0.0453      0.216      0.126      0.105     0.0381

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 16/150      19.6G      1.939      3.339      2.845      2.009          7       2176: 100% 376/376 [04:27<00:00,  1.41it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.32it/s]
               all         46        133      0.174       0.23      0.125      0.048      0.159      0.209       0.11     0.0443

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 17/150      19.6G       1.87       3.32      2.753      1.984         10       2176: 100% 376/376 [04:34<00:00,  1.37it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.34it/s]
               all         46        133      0.158      0.176     0.0973      0.036      0.153      0.161     0.0924     0.0314

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 18/150      19.6G      1.938      3.314      2.804       2.02         14       2176: 100% 376/376 [04:34<00:00,  1.37it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.34it/s]
               all         46        133      0.171      0.236      0.107     0.0395      0.142      0.194      0.103     0.0311

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 19/150      19.6G      1.902      3.333      2.662      2.053          2       2176: 100% 376/376 [04:34<00:00,  1.37it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.33it/s]
               all         46        133      0.149      0.191      0.103     0.0469      0.254      0.113     0.0846     0.0281

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 20/150      19.6G      1.863      3.329      2.719      2.007         17       2176: 100% 376/376 [04:35<00:00,  1.37it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.30it/s]
               all         46        133       0.35      0.128        0.1     0.0382      0.309      0.128     0.0884     0.0293

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 21/150      19.6G      1.956      3.306      2.701      1.995          6       2176: 100% 376/376 [04:32<00:00,  1.38it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.33it/s]
               all         46        133      0.149      0.245      0.105     0.0415      0.132      0.219     0.0897     0.0273

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 22/150      19.6G      1.825      3.214      2.622      1.979          2       2176: 100% 376/376 [04:33<00:00,  1.37it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.36it/s]
               all         46        133      0.108      0.215      0.105     0.0366      0.118      0.256     0.0947     0.0322

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 23/150      19.6G      1.877      3.222       2.67      1.944          6       2176: 100% 376/376 [04:33<00:00,  1.38it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.29it/s]
               all         46        133      0.197      0.237      0.104     0.0408      0.151      0.208     0.0861     0.0314

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 24/150      19.6G      1.819      3.164      2.505      1.968          2       2176: 100% 376/376 [04:34<00:00,  1.37it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.36it/s]
               all         46        133      0.173      0.291      0.129     0.0511      0.174      0.277      0.131     0.0465

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 25/150      19.6G      1.797      3.109      2.498      1.958          8       2176: 100% 376/376 [04:35<00:00,  1.36it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.32it/s]
               all         46        133       0.18      0.363      0.157     0.0537      0.183      0.247      0.122     0.0449

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 26/150      19.6G      1.784      3.106      2.454       1.95         15       2176: 100% 376/376 [04:34<00:00,  1.37it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.30it/s]
               all         46        133      0.218      0.277      0.153     0.0636      0.168      0.317      0.129     0.0491

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 27/150      19.6G      1.825      3.068      2.451      1.915         11       2176: 100% 376/376 [04:35<00:00,  1.36it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.31it/s]
               all         46        133      0.153       0.31       0.13     0.0534      0.138      0.284      0.123     0.0424

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 28/150      19.6G      1.807      3.088      2.441       1.91          4       2176: 100% 376/376 [04:35<00:00,  1.37it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.31it/s]
               all         46        133      0.178      0.266       0.12     0.0495      0.149      0.247      0.119     0.0361

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 29/150      19.6G      1.848      3.063      2.439      1.917          7       2176: 100% 376/376 [04:26<00:00,  1.41it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.47it/s]
               all         46        133      0.153      0.223      0.147      0.063      0.137      0.208      0.129     0.0489

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 30/150      19.6G      1.806      3.061      2.443      1.909          6       2176: 100% 376/376 [04:27<00:00,  1.41it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.48it/s]
               all         46        133      0.231      0.309      0.192     0.0689      0.216      0.301      0.182     0.0626

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 31/150      19.6G      1.823      3.082      2.342      1.923          9       2176: 100% 376/376 [04:28<00:00,  1.40it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.45it/s]
               all         46        133      0.188      0.345      0.157     0.0633      0.238      0.282      0.161     0.0563

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 32/150      19.6G      1.824      3.067      2.334      1.877          8       2176: 100% 376/376 [04:27<00:00,  1.41it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.47it/s]
               all         46        133      0.192      0.305      0.173     0.0711      0.187      0.279      0.136     0.0555

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 33/150      19.6G      1.704      2.983      2.227      1.853         13       2176: 100% 376/376 [04:28<00:00,  1.40it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.48it/s]
               all         46        133      0.272      0.303      0.193     0.0784      0.249      0.274      0.165     0.0675

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 34/150      19.6G      1.757       2.98      2.337      1.824          9       2176: 100% 376/376 [04:27<00:00,  1.41it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.47it/s]
               all         46        133      0.206      0.247      0.139      0.063      0.199      0.224      0.123     0.0483

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 35/150      19.6G      1.674      2.989      2.265      1.876          4       2176: 100% 376/376 [04:34<00:00,  1.37it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.34it/s]
               all         46        133      0.254      0.281      0.181     0.0734       0.22      0.321      0.174     0.0626

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 36/150      19.6G       1.74      2.968      2.275      1.892          9       2176: 100% 376/376 [04:31<00:00,  1.39it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.35it/s]
               all         46        133      0.206      0.252      0.149     0.0619      0.178      0.251       0.12     0.0491

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 37/150      19.6G      1.742      2.947      2.278      1.864          6       2176: 100% 376/376 [04:33<00:00,  1.38it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.35it/s]
               all         46        133      0.191      0.277      0.152     0.0638      0.174      0.256      0.154     0.0582

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 38/150      19.6G      1.691      2.937      2.225      1.835          9       2176: 100% 376/376 [04:32<00:00,  1.38it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.32it/s]
               all         46        133      0.194      0.252      0.151     0.0578      0.193      0.247      0.124     0.0514

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 39/150      19.6G      1.716      2.921      2.241      1.852          3       2176: 100% 376/376 [04:32<00:00,  1.38it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.31it/s]
               all         46        133       0.16      0.279      0.153     0.0665      0.151      0.276       0.14     0.0562

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 40/150      19.6G      1.701      2.967      2.173       1.85          2       2176: 100% 376/376 [04:32<00:00,  1.38it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.37it/s]
               all         46        133      0.232       0.33      0.196     0.0941      0.231      0.277      0.192     0.0806

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 41/150      19.6G      1.686      2.873      2.124      1.825          2       2176: 100% 376/376 [04:32<00:00,  1.38it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.35it/s]
               all         46        133      0.271      0.282      0.194     0.0823      0.239      0.251      0.162     0.0624

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 42/150      19.6G      1.656      2.903       2.16      1.835          5       2176: 100% 376/376 [04:32<00:00,  1.38it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.32it/s]
               all         46        133      0.213      0.269      0.139     0.0523       0.24      0.281      0.139     0.0459

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 43/150      19.6G      1.643      2.782      2.087      1.807         13       2176: 100% 376/376 [04:32<00:00,  1.38it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.31it/s]
               all         46        133      0.234      0.251      0.165     0.0711      0.245      0.338      0.188     0.0689

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 44/150      19.6G      1.669      2.836        2.1      1.827          9       2176: 100% 376/376 [04:33<00:00,  1.37it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.49it/s]
               all         46        133        0.2      0.276      0.178       0.08      0.197      0.353      0.185     0.0796

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 45/150      19.6G      1.729      2.885      2.096       1.79          6       2176: 100% 376/376 [04:32<00:00,  1.38it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.35it/s]
               all         46        133      0.206      0.257      0.161     0.0686      0.189      0.295      0.172     0.0589

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 46/150      19.6G      1.702       2.82      2.067       1.77          4       2176: 100% 376/376 [04:31<00:00,  1.38it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.34it/s]
               all         46        133      0.256      0.265      0.173     0.0814      0.266       0.28       0.18     0.0683

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 47/150      19.6G      1.659      2.825      2.088      1.803          2       2176: 100% 376/376 [04:31<00:00,  1.38it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.30it/s]
               all         46        133      0.283      0.317      0.203     0.0878      0.273       0.31      0.201      0.075

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 48/150      19.6G      1.615      2.771      1.994      1.798          5       2176: 100% 376/376 [04:33<00:00,  1.38it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.32it/s]
               all         46        133      0.299      0.252       0.22     0.0945      0.303      0.251      0.205     0.0869

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 49/150      19.6G      1.686        2.8      2.007      1.781          5       2176: 100% 376/376 [04:31<00:00,  1.38it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.36it/s]
               all         46        133      0.225      0.272      0.199     0.0936      0.235       0.29      0.197     0.0814

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 50/150      19.6G      1.594      2.739      1.937      1.753          3       2176: 100% 376/376 [04:32<00:00,  1.38it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.36it/s]
               all         46        133      0.237      0.266      0.194     0.0786      0.184      0.285      0.146     0.0503

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 51/150      21.9G      1.633      2.792      1.968      1.766          4       2176: 100% 376/376 [04:33<00:00,  1.38it/s]
             Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100% 23/23 [00:05<00:00,  4.32it/s]
               all         46        133      0.257      0.317      0.214     0.0967      0.254      0.308      0.199     0.0818

  Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size
 52/150      21.9G      1.538      2.693       1.96      1.735          9       2176:  19% 70/376 [00:51<03:55,  1.30it/s]
[ WARN:0@14348.514] global loadsave.cpp:244 findDecoder imread_('/content/drive/.shortcut-targets-by-id/1of8frlV3H1_GB4M8xYLa96K0MYP-Xjx5/Fire Behavior/Data/YOLOv6/7_20_2023/images/train/DJI_0013-00_00_29_19-Still026_jpg.rf.d032a549c22ad5601cf5e45ca7ef4af2.jpg'): can't open/read file: check file path/integrity
 52/150      21.9G      1.538      2.693       1.96      1.735          9       2176:  19% 70/376 [00:51<03:43,  1.37it/s]

Traceback (most recent call last):
  File "/usr/local/bin/yolo", line 8, in <module>
    sys.exit(entrypoint())
  File "/usr/local/lib/python3.10/dist-packages/ultralytics/yolo/cfg/__init__.py", line 266, in entrypoint
    getattr(model, mode)(vars(cfg))
  File "/usr/local/lib/python3.10/dist-packages/ultralytics/yolo/engine/model.py", line 214, in train
    self.trainer.train()
  File "/usr/local/lib/python3.10/dist-packages/ultralytics/yolo/engine/trainer.py", line 182, in train
    self._do_train(int(os.getenv("RANK", -1)), world_size)
  File "/usr/local/lib/python3.10/dist-packages/ultralytics/yolo/engine/trainer.py", line 283, in _do_train
    for i, batch in pbar:
  File "/usr/local/lib/python3.10/dist-packages/tqdm/std.py", line 1178, in __iter__
    for obj in iterable:
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 633, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 677, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.10/dist-packages/ultralytics/yolo/data/base.py", line 181, in __getitem__
    return self.transforms(self.get_label_info(index))
  File "/usr/local/lib/python3.10/dist-packages/ultralytics/yolo/data/base.py", line 186, in get_label_info
    label["img"], label["ori_shape"], label["resized_shape"] = self.load_image(index)
  File "/usr/local/lib/python3.10/dist-packages/ultralytics/yolo/data/base.py", line 124, in load_image
    raise FileNotFoundError(f"Image Not Found {f}")
FileNotFoundError: Image Not Found /content/drive/.shortcut-targets-by-id/1of8frlV3H1_GB4M8xYLa96K0MYP-Xjx5/Fire Behavior/Data/YOLOv6/7_20_2023/images/train/DJI_0013-00_00_29_19-Still026_jpg.rf.d032a549c22ad5601cf5e45ca7ef4af2.jpg
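The traceback shows the data loader failing on an image path under `/content/drive` partway through epoch 52, so every batch depends on the Drive mount staying readable for the whole run. One possible mitigation, which nobody in this thread suggested or confirmed, is to copy the dataset to the runtime's local disk before training and point `data.yaml` at the copy. The sketch below is a minimal illustration under that assumption. The helper name `stage_dataset` and the destination directory are hypothetical and are not part of Colab or Ultralytics.

```python
import shutil
from pathlib import Path


def stage_dataset(src: str, dst: str) -> int:
    """Copy a dataset tree from src to dst and return the number of files in src.

    Hypothetical helper for illustration only. It raises if the source is
    missing or if any source file is absent from the destination afterwards.
    """
    src_path, dst_path = Path(src), Path(dst)
    if not src_path.is_dir():
        raise FileNotFoundError(f"Source dataset not found: {src_path}")

    # dirs_exist_ok=True (Python 3.8+) allows re-running after a partial copy.
    shutil.copytree(src_path, dst_path, dirs_exist_ok=True)

    # Verify that every file under src now exists under dst.
    src_files = sorted(
        p.relative_to(src_path) for p in src_path.rglob("*") if p.is_file()
    )
    missing = [p for p in src_files if not (dst_path / p).is_file()]
    if missing:
        raise RuntimeError(
            f"{len(missing)} files missing after copy, e.g. {missing[0]}"
        )
    return len(src_files)
```

In a notebook this could be called once before the `!yolo ...` command, for example `stage_dataset("<dataset folder on the Drive mount>", "/content/dataset")`, with the paths in `data.yaml` updated to the local copy. This only removes the dependence on Drive while images are being read. Outputs written to `runs/segment/...` would still need to be saved somewhere persistent.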

cperry-goog commented 1 year ago

Duping with #3785