ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Failure to start training because the labels are not found. #11131

Closed Bitahsni closed 1 year ago

Bitahsni commented 1 year ago

Search before asking

Question

Hi, I'm trying to train YOLOv5s on a custom dataset using the following command:

!python train.py --img 415 --batch 16 --epochs 30 --data data.yaml --weights yolov5s.pt --cache

but the training process does not start because of the following errors:

### The output:

train: weights=yolov5s.pt, cfg=, data=data.yaml, hyp=data/hyps/hyp.scratch-low.yaml, epochs=30, batch_size=16, imgsz=415, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, noplots=False, evolve=None, bucket=, cache=ram, image_weights=False, device=, multi_scale=False, single_cls=False, optimizer=SGD, sync_bn=False, workers=8, project=runs/train, name=exp, exist_ok=False, quad=False, cos_lr=False, label_smoothing=0.0, patience=100, freeze=[0], save_period=-1, seed=0, local_rank=-1, entity=None, upload_dataset=False, bbox_interval=-1, artifact_alias=latest
github: up to date with https://github.com/ultralytics/yolov5 ✅
YOLOv5 🚀 v7.0-117-g85f6019 Python-3.8.10 torch-1.13.1+cu116 CUDA:0 (Tesla T4, 15102MiB)

hyperparameters: lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0, copy_paste=0.0
ClearML: run 'pip install clearml' to automatically track, visualize and remotely train YOLOv5 🚀 in ClearML
Comet: run 'pip install comet_ml' to automatically track and visualize YOLOv5 🚀 runs in Comet
TensorBoard: Start with 'tensorboard --logdir runs/train', view at http://localhost:6006/
2023-03-08 16:10:41.164474: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-03-08 16:10:42.030414: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib/python3.8/dist-packages/cv2/../../lib64:/usr/lib64-nvidia
2023-03-08 16:10:42.030533: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib/python3.8/dist-packages/cv2/../../lib64:/usr/lib64-nvidia
2023-03-08 16:10:42.030552: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
Downloading https://ultralytics.com/assets/Arial.ttf to /root/.config/Ultralytics/Arial.ttf... 100% 755k/755k [00:00<00:00, 86.4MB/s]
Downloading https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5s.pt to yolov5s.pt... 100% 14.1M/14.1M [00:00<00:00, 222MB/s]

Overriding model.yaml nc=80 with nc=3

             from  n    params  module                                  arguments                     

0 -1 1 3520 models.common.Conv [3, 32, 6, 2, 2]
1 -1 1 18560 models.common.Conv [32, 64, 3, 2]
2 -1 1 18816 models.common.C3 [64, 64, 1]
3 -1 1 73984 models.common.Conv [64, 128, 3, 2]
4 -1 2 115712 models.common.C3 [128, 128, 2]
5 -1 1 295424 models.common.Conv [128, 256, 3, 2]
6 -1 3 625152 models.common.C3 [256, 256, 3]
7 -1 1 1180672 models.common.Conv [256, 512, 3, 2]
8 -1 1 1182720 models.common.C3 [512, 512, 1]
9 -1 1 656896 models.common.SPPF [512, 512, 5]
10 -1 1 131584 models.common.Conv [512, 256, 1, 1]
11 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
12 [-1, 6] 1 0 models.common.Concat [1]
13 -1 1 361984 models.common.C3 [512, 256, 1, False]
14 -1 1 33024 models.common.Conv [256, 128, 1, 1]
15 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
16 [-1, 4] 1 0 models.common.Concat [1]
17 -1 1 90880 models.common.C3 [256, 128, 1, False]
18 -1 1 147712 models.common.Conv [128, 128, 3, 2]
19 [-1, 14] 1 0 models.common.Concat [1]
20 -1 1 296448 models.common.C3 [256, 256, 1, False]
21 -1 1 590336 models.common.Conv [256, 256, 3, 2]
22 [-1, 10] 1 0 models.common.Concat [1]
23 -1 1 1182720 models.common.C3 [512, 512, 1, False]
24 [17, 20, 23] 1 21576 models.yolo.Detect [3, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]]
Model summary: 214 layers, 7027720 parameters, 7027720 gradients, 16.0 GFLOPs

Transferred 343/349 items from yolov5s.pt
AMP: checks passed ✅
WARNING ⚠️ --img-size 415 must be multiple of max stride 32, updating to 416
optimizer: SGD(lr=0.01) with parameter groups 57 weight(decay=0.0), 60 weight(decay=0.0005), 60 bias
albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8))
train: Scanning /content/yolov5/data/images/train/labels... 0 images, 540 backgrounds, 0 corrupt: 100% 540/540 [00:00<00:00, 1579.55it/s]
train: WARNING ⚠️ No labels found in /content/yolov5/data/images/train/labels.cache. See https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data
train: New cache created: /content/yolov5/data/images/train/labels.cache
Traceback (most recent call last):
  File "train.py", line 640, in <module>
    main(opt)
  File "train.py", line 529, in main
    train(opt.hyp, opt, device, callbacks)
  File "train.py", line 188, in train
    train_loader, dataset = create_dataloader(train_path,
  File "/content/yolov5/utils/dataloaders.py", line 124, in create_dataloader
    dataset = LoadImagesAndLabels(
  File "/content/yolov5/utils/dataloaders.py", line 502, in __init__
    assert nf > 0 or not augment, f'{prefix}No labels found in {cache_path}, can not start training. {HELP_URL}'
AssertionError: train: No labels found in /content/yolov5/data/images/train/labels.cache, can not start training. See https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data

But in the training folder, there are labels.

Additional

No response

github-actions[bot] commented 1 year ago

👋 Hello @Bitahsni, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Requirements

Python>=3.7.0 with all requirements.txt installed including PyTorch>=1.7. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

YOLOv5 CI

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on MacOS, Windows, and Ubuntu every 24 hours and on every commit.

Introducing YOLOv8 🚀

We're excited to announce the launch of our latest state-of-the-art (SOTA) object detection model for 2023 - YOLOv8 🚀!

Designed to be fast, accurate, and easy to use, YOLOv8 is an ideal choice for a wide range of object detection, image segmentation and image classification tasks. With YOLOv8, you'll be able to quickly and accurately detect objects in real-time, streamline your workflows, and achieve new levels of accuracy in your projects.

Check out our YOLOv8 Docs for details and get started with:

pip install ultralytics

neo133 commented 1 year ago

Check your paths. They should be relative to the YOLOv5 folder, or you can give the full path.
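A minimal sketch of what checking the paths can look like, assuming the contents of data.yaml have already been loaded into a dict (for example with `yaml.safe_load`). The helper name `resolve_dataset_paths` and its `base_dir` parameter are hypothetical, and the resolution rule used here (relative entries are joined onto `path`, which is itself joined onto `base_dir`) is a simplification of what YOLOv5 does internally.

```python
from pathlib import Path


def resolve_dataset_paths(cfg: dict, base_dir: str) -> dict:
    """Return {split: (absolute_path, exists)} for the train/val/test entries.

    Simplified assumption: relative entries are resolved against cfg["path"]
    (if present), which is itself resolved against base_dir.
    """
    root = Path(cfg.get("path") or "")
    if not root.is_absolute():
        root = Path(base_dir) / root
    report = {}
    for split in ("train", "val", "test"):
        entry = cfg.get(split)
        if not entry:
            continue  # this split is not configured
        full = Path(entry)
        if not full.is_absolute():
            full = root / full
        report[split] = (full, full.exists())
    return report


if __name__ == "__main__":
    # Hypothetical config; replace with the contents of your own data.yaml.
    cfg = {"train": "data/images/train", "val": "data/images/val"}
    for split, (path, exists) in resolve_dataset_paths(cfg, "/content/yolov5").items():
        print(f"{split}: {path} -> {'found' if exists else 'MISSING'}")
```

Any split reported as MISSING points at a path that train.py is unlikely to find either.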

tlong123 commented 1 year ago

I believe the folder structure for training should look like:

dataset

i.e. the label files should be in a separate folder called labels alongside the images folder

neo133 commented 1 year ago

Please go through the custom data training tutorial (linked below). The folder structure you mentioned is in the wrong order.

Train Custom Data

DhruvAwasthi commented 1 year ago

@Bitahsni I am using yolo for object detection, and the following structure works like a charm:

train/
    images/
        1.jpg
        2.jpg
    labels/
        1.txt
        2.txt

val/
    images/
        3.jpg
        4.jpg
    labels/
        3.txt
        4.txt

test/
    images/
        5.jpg
        6.jpg
    labels/
        5.txt
        6.txt

Hope this helps!
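The reason a sibling `labels/` folder works can be sketched in a few lines. As far as the loader in utils/dataloaders.py can be read, each label path is derived from its image path by swapping the last `/images/` for `/labels/` and the extension for `.txt`. The function below is a simplified re-implementation under that assumption (forward slashes only, hypothetical name `image_to_label_path`), not the library code itself.

```python
def image_to_label_path(image_path: str) -> str:
    """Swap the last '/images/' for '/labels/' and the extension for '.txt'."""
    head, sep, tail = image_path.rpartition("/images/")
    # If no '/images/' folder is present, the label is expected next to the image.
    swapped = f"{head}/labels/{tail}" if sep else image_path
    return swapped.rsplit(".", 1)[0] + ".txt"


if __name__ == "__main__":
    for img in ("train/images/1.jpg", "val/images/3.jpg"):
        print(img, "->", image_to_label_path(img))
```

Under this rule only the last `images` folder in the path is swapped, so an `images` folder higher up in the path does not matter.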

Bitahsni commented 1 year ago

check your paths...it should be relative to the yolo folder..or give full path

Hello, thank you for your help, my problem is solved.

Hnshlr commented 1 year ago

What was the solution please? Currently have the same problem

Bitahsni commented 1 year ago

@Hnshlr Hi, that error was caused by the paths. Check them again.

SamDaaLamb commented 1 year ago

I have the same issue. I feel like I am giving the full paths in the train.py file. Where exactly am I supposed to give the full path?

glenn-jocher commented 1 year ago

@SamDaaLamb when you run train.py, make sure that you provide the full path for the following arguments:

  1. --data argument: Specify the full path to your data.yaml file. The data.yaml file should contain the paths to your training, validation, and testing data.

  2. --cfg argument: Specify the full path to your model's configuration file (e.g., yolov5s.yaml, yolov5m.yaml, etc.).

  3. --weights argument: If you're using a pre-trained model, provide the full path to the weights file (e.g., yolov5s.pt, yolov5m.pt, etc.).

Here's an example of what the command should look like:

python train.py --img 640 --batch 16 --epochs 50 --data /full/path/to/data.yaml --cfg /full/path/to/yolov5s.yaml --weights /full/path/to/weights.pt

Make sure you replace /full/path/to/ with the actual full paths specific to your system.
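If typing full paths by hand is error-prone, the command can also be assembled programmatically. The sketch below is an assumption-laden convenience, not part of YOLOv5: `build_train_command` is a hypothetical helper that turns whatever paths it is given into absolute ones. It leaves out `--cfg`, which the original command in this thread also ran without.

```python
import shlex
from pathlib import Path


def build_train_command(data_yaml: str, weights: str, img: int = 640,
                        batch: int = 16, epochs: int = 50) -> list:
    """Build the argument list for train.py with absolute --data/--weights paths."""
    return [
        "python", "train.py",
        "--img", str(img),
        "--batch", str(batch),
        "--epochs", str(epochs),
        "--data", str(Path(data_yaml).resolve()),      # absolute path to data.yaml
        "--weights", str(Path(weights).resolve()),     # absolute path to the weights file
    ]


if __name__ == "__main__":
    cmd = build_train_command("data.yaml", "yolov5s.pt")
    print(shlex.join(cmd))  # paste into a shell, or pass cmd to subprocess.run
```

Note that resolving `--weights` to a local path assumes the file already exists there. A bare name such as yolov5s.pt is downloaded into the working directory, as the log at the top of this thread shows.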

Let me know if you have any further questions!

SamDaaLamb commented 1 year ago

Unfortunately, whenever I have recently imported my datasets from Roboflow into Google Colab, the labels folders (in train, test and validation) are called labelTxt instead of just labels. I have noticed that whenever the labels folder has this name, the script no longer works and gives the error in question above, because it is obviously looking for a folder named "labels". If I manually change the folder name to "labels", a completely different error pops up:

AMP: checks passed ✅
Scaled weight_decay = 0.0005
optimizer: SGD with parameter groups 59 weight (no decay), 70 weight, 62 bias
albumentations: Blur(always_apply=False, p=0.01, blur_limit=(3, 7)), MedianBlur(always_apply=False, p=0.01, blur_limit=(3, 7)), ToGray(always_apply=False, p=0.01), CLAHE(always_apply=False, p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8))
train: Scanning '/content/yolov5/train/labels' images and labels...2547 found, 0 missing, 0 empty, 2547 corrupt: 100% 2547/2547 [00:00<00:00, 2717.66it/s]
train: WARNING: /content/yolov5/train/images/00211_jpg.rf.aeb48c588755a9b2ae401dea5826323b.jpg: ignoring corrupt image/label: cannot reshape array of size 9 into shape (2)
train: WARNING: /content/yolov5/train/images/00271_jpg.rf.c252cd8d1f0e51d3cf67b2de36e331f4.jpg: ignoring corrupt image/label: cannot reshape array of size 9 into shape (2)

...etc...

train: WARNING: /content/yolov5/train/images/scene08533_jpg.rf.f2916785297bef3820ce872b58a6f89f.jpg: ignoring corrupt image/label: cannot reshape array of size 9 into shape (2)
train: WARNING: /content/yolov5/train/images/scene08536_jpg.rf.cd3403e8131f7a209420fc2c020d4a67.jpg: ignoring corrupt image/label: cannot reshape array of size 9 into shape (2)
train: New cache created: /content/yolov5/train/labels.cache
Traceback (most recent call last):
  File "/content/yolov5/train.py", line 654, in <module>
    main(opt)
  File "/content/yolov5/train.py", line 549, in main
    train(opt.hyp, opt, device, callbacks)
  File "/content/yolov5/train.py", line 211, in train
    train_loader, dataset = create_dataloader(train_path,
  File "/content/yolov5/utils/dataloaders.py", line 114, in create_dataloader
    dataset = LoadImagesAndLabels(
  File "/content/yolov5/utils/dataloaders.py", line 470, in __init__
    labels, shapes, self.segments = zip(*cache.values())
ValueError: not enough values to unpack (expected 3, got 0)
CPU times: user 161 ms, sys: 22.9 ms, total: 184 ms
Wall time: 12.4 s

It seems that changing the directory name to "labels" corrupts all the files so I am left with no solution. :(

Does anyone have any suggestions as to the nature of my problem?

glenn-jocher commented 1 year ago

@SamDaaLamb it seems that the issue you're facing is related to the folder name of your labels. The script is expecting the labels folder to be named "labels", but in your case, it is named "labelTxt". When you manually change the folder name to "labels", you encounter a different error regarding corrupt images/labels.

One possible solution to this problem could be modifying the script itself to handle the "labelTxt" folder name. You would need to locate the part of the script where it retrieves the labels folder and update it to match your folder name.

Another suggestion would be to double-check the format of your labels and ensure they are compatible with the script. Make sure the labels are in the correct format and contain the necessary information. This could help resolve the "corrupt image/label" error.

If these suggestions don't solve your problem, it might be helpful to provide additional details about your dataset preparation process and any specific steps or code that you're using. This would allow the community to better understand your situation and provide more targeted assistance.
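A quick way to act on the label-format suggestion is to validate label lines before training. The sketch below assumes the plain YOLO box format, one object per line as `class x_center y_center width height` with coordinates normalized to 0-1. Polygon (segmentation) labels are deliberately out of scope. The function name `check_label_line` and the sample lines are made up for illustration.

```python
def check_label_line(line: str) -> str:
    """Return '' if the line looks like a YOLO box label, else a short reason."""
    parts = line.split()
    if not parts:
        return ""  # blank lines are harmless
    if len(parts) != 5:
        return f"expected 5 values (class x_center y_center width height), got {len(parts)}"
    try:
        values = [float(v) for v in parts]
    except ValueError:
        return "non-numeric value"
    if values[0] < 0 or values[0] % 1 != 0:
        return "class index must be a non-negative integer"
    if any(c < 0.0 or c > 1.0 for c in values[1:]):
        return "coordinates must be normalized to the range 0-1"
    return ""


if __name__ == "__main__":
    samples = [
        "0 0.5 0.5 0.2 0.3",            # plain YOLO box label
        "10 20 30 20 30 40 10 40 sign 0",  # hypothetical corner-point style line
    ]
    for line in samples:
        print(repr(line), "->", check_label_line(line) or "ok")
```

If most lines in the renamed folder fail the 5-value check, the export format, not the folder name, is the more likely culprit, and re-exporting the dataset in a YOLOv5-compatible format may be simpler than editing the script.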

sg-vd commented 1 year ago

Hi, can you please explain how you resolved this?

glenn-jocher commented 1 year ago

@sg-vd hi, it seems like the error message indicates that the script is unable to find any labels in the specified folder. Although you mentioned that there are labels present in the training folder, the script is not able to detect them.

To troubleshoot this issue, you can try the following steps:

  1. Double-check the path to the labels folder and ensure that it is correct and matches the location the script expects.
  2. Verify that the labels are in the correct format and contain the necessary information, such as bounding box coordinates and class labels.
  3. Ensure that the labels are saved as .txt files (YOLOv5 does not read .xml annotations directly) and share the file names of their images.
  4. Confirm that the labels are stored in a directory structure that mirrors the corresponding images, and that the image paths are specified correctly in the data.yaml file.
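For the file-extension check in particular, counting the extensions in the labels folder gives a fast answer. This is a small sketch with a hypothetical helper name, `count_extensions`. The demo runs on a throwaway folder so that it works anywhere, and you would point it at your own labels folder instead.

```python
import tempfile
from collections import Counter
from pathlib import Path


def count_extensions(folder: str) -> Counter:
    """Count files in `folder` (non-recursive) by lower-cased extension."""
    return Counter(p.suffix.lower() for p in Path(folder).iterdir() if p.is_file())


if __name__ == "__main__":
    # Demo on a temporary folder with made-up file names.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("1.txt", "2.txt", "3.xml"):
            (Path(tmp) / name).write_text("")
        print(count_extensions(tmp))
```

A folder that contains only `.xml` files, or no files at all, would explain a "No labels found" message even though annotations exist somewhere.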

If you have already tried these steps and the issue persists, please provide more details about your directory structure, labeling format, and any specific steps or modifications you have made. This additional information will help the community better understand your situation and provide more specific guidance.

RajaBersiung commented 12 months ago

Please help me solve this error. I am working on sign language recognition and used the LabelImg software to annotate the images in YOLO format, which produces .txt labels. I ran the code in Google Colab and this error occurred. The paths in data.yaml are correct and the image and label file names match. How can I solve this problem?

/content/yolov5
2023-11-13 16:03:54.263624: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-11-13 16:03:54.263686: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-11-13 16:03:54.263725: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
train: weights=yolov5s.pt, cfg=./models/custom_yolov5s.yaml, data=/content/data.yaml, hyp=data/hyps/hyp.scratch-low.yaml, epochs=10, batch_size=16, imgsz=416, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, noplots=False, evolve=None, bucket=, cache=ram, image_weights=False, device=, multi_scale=False, single_cls=False, optimizer=SGD, sync_bn=False, workers=8, project=runs/train, name=yolov5s_results, exist_ok=False, quad=False, cos_lr=False, label_smoothing=0.0, patience=100, freeze=[0], save_period=-1, seed=0, local_rank=-1, entity=None, upload_dataset=False, bbox_interval=-1, artifact_alias=latest
github: up to date with https://github.com/ultralytics/yolov5 ✅
YOLOv5 🚀 v7.0-240-g84ec8b5 Python-3.10.12 torch-2.1.0+cu118 CUDA:0 (Tesla T4, 15102MiB)

hyperparameters: lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0, copy_paste=0.0
Comet: run 'pip install comet_ml' to automatically track and visualize YOLOv5 🚀 runs in Comet
TensorBoard: Start with 'tensorboard --logdir runs/train', view at http://localhost:6006/

             from  n    params  module                                  arguments                     

0 -1 1 3520 models.common.Focus [3, 32, 3]
1 -1 1 18560 models.common.Conv [32, 64, 3, 2]
2 -1 1 19904 models.common.BottleneckCSP [64, 64, 1]
3 -1 1 73984 models.common.Conv [64, 128, 3, 2]
4 -1 3 161152 models.common.BottleneckCSP [128, 128, 3]
5 -1 1 295424 models.common.Conv [128, 256, 3, 2]
6 -1 3 641792 models.common.BottleneckCSP [256, 256, 3]
7 -1 1 1180672 models.common.Conv [256, 512, 3, 2]
8 -1 1 656896 models.common.SPP [512, 512, [5, 9, 13]]
9 -1 1 1248768 models.common.BottleneckCSP [512, 512, 1, False]
10 -1 1 131584 models.common.Conv [512, 256, 1, 1]
11 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
12 [-1, 6] 1 0 models.common.Concat [1]
13 -1 1 378624 models.common.BottleneckCSP [512, 256, 1, False]
14 -1 1 33024 models.common.Conv [256, 128, 1, 1]
15 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
16 [-1, 4] 1 0 models.common.Concat [1]
17 -1 1 95104 models.common.BottleneckCSP [256, 128, 1, False]
18 -1 1 147712 models.common.Conv [128, 128, 3, 2]
19 [-1, 14] 1 0 models.common.Concat [1]
20 -1 1 313088 models.common.BottleneckCSP [256, 256, 1, False]
21 -1 1 590336 models.common.Conv [256, 256, 3, 2]
22 [-1, 10] 1 0 models.common.Concat [1]
23 -1 1 1248768 models.common.BottleneckCSP [512, 512, 1, False]
24 [17, 20, 23] 1 21576 models.yolo.Detect [3, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]]
custom_YOLOv5s summary: 233 layers, 7260488 parameters, 7260488 gradients

Transferred 223/369 items from yolov5s.pt
AMP: checks passed ✅
optimizer: SGD(lr=0.01) with parameter groups 59 weight(decay=0.0), 70 weight(decay=0.0005), 62 bias
albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8))
train: Scanning /content/EMSL/Train/Images... 0 images, 90 backgrounds, 0 corrupt: 100% 90/90 [00:00<00:00, 1258.42it/s]
train: WARNING ⚠️ No labels found in /content/EMSL/Train/Images.cache. See https://docs.ultralytics.com/yolov5/tutorials/train_custom_data
train: New cache created: /content/EMSL/Train/Images.cache
Traceback (most recent call last):
  File "/content/yolov5/train.py", line 647, in <module>
    main(opt)
  File "/content/yolov5/train.py", line 536, in main
    train(opt.hyp, opt, device, callbacks)
  File "/content/yolov5/train.py", line 195, in train
    train_loader, dataset = create_dataloader(train_path,
  File "/content/yolov5/utils/dataloaders.py", line 124, in create_dataloader
    dataset = LoadImagesAndLabels(
  File "/content/yolov5/utils/dataloaders.py", line 502, in __init__
    assert nf > 0 or not augment, f'{prefix}No labels found in {cache_path}, can not start training. {HELP_URL}'
AssertionError: train: No labels found in /content/EMSL/Train/Images.cache, can not start training. See https://docs.ultralytics.com/yolov5/tutorials/train_custom_data
CPU times: user 72.5 ms, sys: 19.5 ms, total: 92.1 ms
Wall time: 10.6 s

glenn-jocher commented 12 months ago

@RajaBersiung it seems like the error is related to the script being unable to find any labels in the specified folder. Here are some steps you can take to troubleshoot this issue:

  1. Double-check the path to the labels folder and ensure that it is correctly specified in the data.yaml file.
  2. Verify that the labels are in the correct format and contain the necessary information, such as bounding box coordinates and class labels.
  3. Ensure that the labels are saved with the correct file extension (e.g., .txt) and have the same file names as the corresponding images.
  4. Confirm that the labels are stored in the same directory structure as the corresponding images.

If you have already verified the above points and the issue persists, please share the directory structure of your dataset, the format of your labels, and any specific modifications you have made to the YOLOv5 training script. This additional information will help in diagnosing the problem more accurately.

RajaBersiung commented 12 months ago

How should the dataset be arranged?

Is it like this?

glenn-jocher commented 12 months ago

@RajaBersiung yes, that's correct. The dataset should be organized as follows:

YourDatasetFolder
  - images
    - train
      - image1.jpg
      - image2.jpg
      - ...
    - test
      - image_test1.jpg
      - image_test2.jpg
      - ...
  - labels
    - train
      - image1.txt
      - image2.txt
      - ...
    - test
      - image_test1.txt
      - image_test2.txt
      - ...

In this structure, the images are organized into "train" and "test" folders, and the corresponding label files are stored in the "labels" directory following the same "train" and "test" subdirectory structure. This arrangement allows the YOLOv5 training script to locate the images and their associated labels correctly during the training process.
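To check a dataset against this layout, a small script can list images that have no matching label file. This is only a sketch: the helper name `find_missing_labels`, the `IMAGE_SUFFIXES` set, and the demo file names are assumptions for illustration, not part of YOLOv5.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}  # assumption: extend as needed


def find_missing_labels(dataset_root: str, split: str) -> list:
    """Return image file names in images/<split> with no labels/<split>/<stem>.txt."""
    root = Path(dataset_root)
    images_dir = root / "images" / split
    labels_dir = root / "labels" / split
    missing = []
    for img in sorted(images_dir.iterdir()):
        is_image = img.suffix.lower() in IMAGE_SUFFIXES
        if is_image and not (labels_dir / f"{img.stem}.txt").is_file():
            missing.append(img.name)
    return missing


if __name__ == "__main__":
    # Build a tiny throwaway dataset where image2.jpg has no label file.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "images" / "train").mkdir(parents=True)
        (Path(tmp) / "labels" / "train").mkdir(parents=True)
        for name in ("image1.jpg", "image2.jpg"):
            (Path(tmp) / "images" / "train" / name).write_text("")
        (Path(tmp) / "labels" / "train" / "image1.txt").write_text("0 0.5 0.5 0.2 0.2")
        print(find_missing_labels(tmp, "train"))
```

An image without a label file is treated as a background image, so a run where every image is reported as missing matches the "0 images, N backgrounds" pattern seen in the logs in this thread.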

RajaBersiung commented 12 months ago

This is the content of my data.yaml:

train: /content/Train/Images
val: /content/Test/Images

nc: 3
names: ['air', 'demam', 'degar']

The format in the Ultralytics YOLOv5 GitHub repository differs from mine. The repository uses this format:

path: ../datasets/coco128 # dataset root dir
train: images/train2017 # train images (relative to 'path') 128 images
val: images/train2017 # val images (relative to 'path') 128 images
test: # test images (optional)

# Classes
names:
  0: person
  1: bicycle
  2: car

So are both formats equivalent for data.yaml?

glenn-jocher commented 12 months ago

@RajaBersiung the format for the data.yaml file in YOLOv5 can vary slightly based on your specific use case. The format you have provided:

train: /content/Train/Images
val: /content/Test/Images
nc: 3
names: ['air', 'demam', 'degar']

is perfectly valid and suitable for your custom dataset with 3 classes named 'air', 'demam', and 'degar'.

The example from the YOLOv5 GitHub repository that you mentioned is also valid, and it is usually used with the COCO dataset or datasets following a similar structure.

In summary, both formats for the data.yaml file are appropriate, and you should use the one that aligns with your dataset's directory structure and class names.
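Independently of the data.yaml format, one thing that may be worth checking is the folder naming. As far as the loader in utils/dataloaders.py can be read, label paths are derived by a case-sensitive search for a folder named exactly `images`. If none is found, the .txt files are looked for next to the images. The log above, where the cache is written to `Images.cache`, is consistent with that, but this reading is an assumption and not something confirmed in this thread. The helper below is hypothetical and only tests the folder name:

```python
from pathlib import PurePosixPath


def has_lowercase_images_dir(image_dir: str) -> bool:
    """True if some component of the path is exactly 'images' (case-sensitive)."""
    return "images" in PurePosixPath(image_dir).parts


if __name__ == "__main__":
    for folder in ("/content/Train/Images", "../datasets/coco128/images/train2017"):
        verdict = "labels expected in a sibling labels/ folder" if has_lowercase_images_dir(folder) \
            else "labels expected next to the images"
        print(folder, "->", verdict)
```

If the check returns False, either rename the folders to lowercase `images` and `labels`, or place each .txt file in the same folder as its image.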

YngMgC commented 7 months ago

Hi, I have the same question right now. I have built the right structure, and it trains locally. But when I upload it to Colab and change the path in data.yaml, it still shows:

Traceback (most recent call last):
  File "/content/drive/MyDrive/yolov5/train.py", line 848, in <module>
    main(opt)
  File "/content/drive/MyDrive/yolov5/train.py", line 623, in main
    train(opt.hyp, opt, device, callbacks)
  File "/content/drive/MyDrive/yolov5/train.py", line 254, in train
    train_loader, dataset = create_dataloader(
  File "/content/drive/MyDrive/yolov5/utils/dataloaders.py", line 181, in create_dataloader
    dataset = LoadImagesAndLabels(
  File "/content/drive/MyDrive/yolov5/utils/dataloaders.py", line 604, in __init__
    assert nf > 0 or not augment, f"{prefix}No labels found in {cache_path}, can not start training. {HELP_URL}"
AssertionError: train: No labels found in /content/drive/MyDrive/yolov5/dataset/data/labels/train.cache, can not start training.

I think I have built the right dataset structure. Here it is:

[Screenshot: dataset folder structure]

But it does not work and shows the same message that the cache has no labels in it. How can I solve this? Thank you.

glenn-jocher commented 7 months ago

@YngMgC hi there! 😊 It looks like you've done a great job in setting up your dataset. Since you can train locally but encounter issues on Colab after changing the path in data.yaml, let's ensure the paths are correctly pointed to your dataset on Google Drive.

Please verify that your data.yaml paths are absolute and correctly reference the dataset location on Google Drive. Sometimes, a small typo or incorrect path can lead to this error. Here's a quick example to double-check:

train: /content/drive/MyDrive/your_dataset_folder/images/train
val: /content/drive/MyDrive/your_dataset_folder/images/val

Additionally, running the following command in Colab helps ensure that your dataset structure is correct and accessible:

!ls /content/drive/MyDrive/your_dataset_folder/images/train

This should list your image files, confirming that the path is correct. If the images appear but the issue persists, try clearing the cache by deleting any .cache files in your dataset directory, then rerun your training command.
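Clearing the cache can be scripted as well. The sketch below removes every `*.cache` file under a dataset folder. The helper name `clear_label_caches` is hypothetical, and the demo runs on a temporary folder so nothing real is deleted. Pass your own dataset root once you are sure it contains no other files ending in `.cache`.

```python
import tempfile
from pathlib import Path


def clear_label_caches(dataset_root: str) -> list:
    """Delete every *.cache file under dataset_root and return the removed paths."""
    removed = []
    # sorted() materializes the matches first, so deleting does not disturb the walk.
    for cache_file in sorted(Path(dataset_root).rglob("*.cache")):
        if cache_file.is_file():
            cache_file.unlink()
            removed.append(cache_file)
    return removed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        labels = Path(tmp) / "labels"
        labels.mkdir()
        (labels / "train.cache").write_text("stale")
        (labels / "image1.txt").write_text("0 0.5 0.5 0.2 0.2")
        for path in clear_label_caches(tmp):
            print("removed", path.name)
```

The next training run then rescans the folders and writes a fresh cache, which matters after files have been re-uploaded or renamed.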

Let us know how it goes!

YngMgC commented 7 months ago

Hello, thank you for your reply. I checked my file paths and there was no problem. After that, I used the command you provided and found that some pictures were missing. Fortunately, the missing files were easy to find, so I uploaded my dataset again. The upload speed is a bit slow, but I believe that was the reason for the error.

Thank you for your advice. It is necessary to use the "ls" command to check whether the file exists in the path. Thank you for your guidance, I would have spent a lot of time discovering this issue without your help.

glenn-jocher commented 7 months ago

Hi there! 😊 You're most welcome, and I'm so glad to hear the ls command helped you identify the missing files. Yes, file paths can be quite tricky, especially when working in environments like Colab where paths need to be exact. Slow upload speeds can indeed be a bit of a hurdle, but sounds like you're on the right track now.

If you encounter any more questions or run into other hurdles along the way, feel free to reach out. The YOLO community and the Ultralytics team are here to help. Best of luck with your training! 🚀

thomasjubin commented 6 months ago

I appreciate the helpful guidance. Following your instructions, I managed to address the issue I was facing successfully.

glenn-jocher commented 6 months ago

I'm thrilled to hear that the guidance was helpful and that you were able to resolve your issue! 😊 If you have any more questions or run into new challenges in the future, don't hesitate to reach out. Happy coding, and best of luck with your projects! 🚀