Problem with 90:10:0 and 70:30:0 dataset proportions

Leprechault commented 2 years ago

Search before asking

[X] I have searched the YOLOv5 issues and discussions and found no similar questions.

Question

Hi Everyone!

I'd like to test a custom model using 90:10:0 and 70:30:0 dataset proportions, but if I either leave out the val file or create an empty one, the output is:

Caching images (0.5GB): 100% 3129/3129 [00:02<00:00, 1498.57it/s]
Traceback (most recent call last):
  File "/content/yolov5/utils/datasets.py", line 362, in __init__
    assert self.img_files, 'No images found'
AssertionError: No images found

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 492, in <module>
    train(hyp, opt, device, tb_writer, wandb)
  File "train.py", line 196, in train
    rank=-1, world_size=opt.world_size, workers=opt.workers)[0]  # testloader
  File "/content/yolov5/utils/datasets.py", line 70, in create_dataloader
    image_weights=image_weights)
  File "/content/yolov5/utils/datasets.py", line 364, in __init__
    raise Exception('Error loading data from %s: %s\nSee %s' % (path, e, help_url))
Exception: Error loading data from /content/img/val: No images found

What changes do I need to make in train.py to test my model without using the val files?

Thanks in advance!

Additional

No response

glenn-jocher commented 2 years ago

@Leprechault 👋 Hello! Thanks for asking about YOLOv5 🚀 dataset formatting. To train correctly your data must be in YOLOv5 format. Please see our Train Custom Data tutorial for full documentation on dataset setup and all steps required to start training your first model. A few excerpts from the tutorial:

1.1 Create dataset.yaml

COCO128 is an example small tutorial dataset composed of the first 128 images in COCO train2017. These same 128 images are used for both training and validation to verify our training pipeline is capable of overfitting. data/coco128.yaml, shown below, is the dataset config file that defines 1) the dataset root directory path and relative paths to train / val / test image directories (or *.txt files with image paths), 2) the number of classes nc and 3) a list of class names:

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/coco128  # dataset root dir
train: images/train2017  # train images (relative to 'path') 128 images
val: images/train2017  # val images (relative to 'path') 128 images
test:  # test images (optional)

# Classes
nc: 80  # number of classes
names: [ 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
         'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
         'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
         'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
         'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
         'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
         'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
         'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
         'hair drier', 'toothbrush' ]  # class names
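For option 2 in the comment above (a *.txt file listing image paths), a split such as 90:10 could be produced roughly as follows. This is only a sketch using the Python standard library: split_train_val, write_list, the suffix set and the seed are hypothetical names and choices for illustration, not part of YOLOv5.

```python
import random
from pathlib import Path

# Assumed subset of image extensions; adjust to match your data.
IMG_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def split_train_val(image_dir, train_fraction=0.9, seed=0):
    """Return (train, val) lists of image paths, shuffled reproducibly."""
    images = sorted(
        p for p in Path(image_dir).rglob("*") if p.suffix.lower() in IMG_SUFFIXES
    )
    if len(images) < 2:
        raise ValueError("need at least 2 images so that both splits are non-empty")
    random.Random(seed).shuffle(images)
    n_train = round(len(images) * train_fraction)
    # Clamp so that neither split ends up empty.
    n_train = min(max(n_train, 1), len(images) - 1)
    return images[:n_train], images[n_train:]


def write_list(paths, txt_file):
    """Write one image path per line."""
    Path(txt_file).write_text("".join(f"{p}\n" for p in paths))
```

The two resulting *.txt files could then be referenced from the train: and val: entries of the dataset yaml. The clamp keeps at least one image in val, since an empty val set is what triggers the "No images found" error in the question.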

1.2 Create Labels

After using a tool like Roboflow Annotate to label your images, export your labels to YOLO format, with one *.txt file per image (if no objects in image, no *.txt file is required). The *.txt file specifications are:

One row per object
Each row is class x_center y_center width height format.
Box coordinates must be in normalized xywh format (from 0 - 1). If your boxes are in pixels, divide x_center and width by image width, and y_center and height by image height.
Class numbers are zero-indexed (start from 0).
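
The normalization rule above can be sketched in a few lines. The input is assumed here to be a pixel box given by its corners (x_min, y_min, x_max, y_max); pixel_box_to_yolo is a hypothetical helper name, not a YOLOv5 function.

```python
def pixel_box_to_yolo(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel corner box to one 'class x_center y_center width height' row."""
    # x values are divided by image width, y values by image height.
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 200x200 px box with corners (100, 50) and (300, 250) in a 400x500 px image:
print(pixel_box_to_yolo(0, 100, 50, 300, 250, 400, 500))
# -> 0 0.500000 0.300000 0.500000 0.400000
```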

In the tutorial's example, the label file for the pictured image contains 2 persons (class 0) and a tie (class 27).

1.3 Organize Directories

Organize your train and val images and labels according to the example below. YOLOv5 assumes /coco128 is inside a /datasets directory next to the /yolov5 directory. YOLOv5 locates labels automatically for each image by replacing the last instance of /images/ in each image path with /labels/. For example:

../datasets/coco128/images/im0.jpg  # image
../datasets/coco128/labels/im0.txt  # label
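
The path substitution described above can be illustrated like this. It is a simplified sketch that assumes forward-slash paths; image_to_label_path is a hypothetical name and this is not the actual YOLOv5 implementation.

```python
def image_to_label_path(image_path):
    """Swap the last '/images/' for '/labels/' and the extension for '.txt'."""
    head, sep, tail = image_path.rpartition("/images/")
    if not sep:
        raise ValueError(f"no '/images/' component in {image_path!r}")
    stem = tail.rsplit(".", 1)[0]  # drop the image extension
    return f"{head}/labels/{stem}.txt"


print(image_to_label_path("../datasets/coco128/images/im0.jpg"))
# -> ../datasets/coco128/labels/im0.txt
```

Because only the last /images/ is replaced, a dataset stored under a parent directory that is itself named images still resolves to the intended labels directory.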

Good luck 🍀 and let us know if you have any other questions!

Leprechault commented 2 years ago

Thanks, @glenn-jocher. I sometimes confuse which files are used during the training process, train or val. So val is used during training, and test is used for analysing the model results.
