Closed bryanbocao closed 2 years ago
@bryanbo-cao 👋 Hello! Thanks for asking about YOLOv5 🚀 dataset formatting. You could just use one data.yaml and bash script to rename your directories between each of the 16 trainings.
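A minimal sketch of the "rename your directories between trainings" idea, written in Python rather than bash. The helper name `active_variant` and the `__parked` backup suffix are hypothetical choices for illustration. The sketch assumes the variants sit next to the default directory, as in `labels` and `labels_v2`.

```python
import contextlib
import os
from pathlib import Path


@contextlib.contextmanager
def active_variant(root: Path, kind: str, variant: str):
    """Temporarily expose root/variant under the default name root/kind.

    kind is "images" or "labels"; variant is e.g. "labels_v2".
    The base directory is parked under a backup name and restored on exit.
    """
    default = root / kind
    chosen = root / variant
    if chosen == default:
        # The base variant is already in place; nothing to swap.
        yield default
        return
    backup = root / f"{kind}__parked"  # hypothetical backup name
    os.rename(default, backup)  # park the base directory
    os.rename(chosen, default)  # expose the variant under the default name
    try:
        yield default
    finally:
        os.rename(default, chosen)  # put the variant back
        os.rename(backup, default)  # restore the base directory
```

A training run would then be launched inside the `with` block, for example by calling `python train.py --data data.yaml` through `subprocess.run`. Because directories are renamed in place, only one training can use the dataset root at a time.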
For examples of using image directories instead of txt lists of images see other datasets like VOC.yaml: https://github.com/ultralytics/yolov5/blob/d6051382f1551455b88ca086b99275cfc8286131/data/VOC.yaml#L1-L21
To train correctly your data must be in YOLOv5 format. Please see our Train Custom Data tutorial for full documentation on dataset setup and all steps required to start training your first model. A few excerpts from the tutorial:
COCO128 is an example small tutorial dataset composed of the first 128 images in COCO train2017. These same 128 images are used for both training and validation to verify our training pipeline is capable of overfitting. data/coco128.yaml, shown below, is the dataset config file that defines 1) the dataset root directory path and relative paths to train / val / test image directories (or *.txt files with image paths), 2) the number of classes nc and 3) a list of class names:
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/coco128 # dataset root dir
train: images/train2017 # train images (relative to 'path') 128 images
val: images/train2017 # val images (relative to 'path') 128 images
test: # test images (optional)
# Classes
nc: 80 # number of classes
names: [ 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
'hair drier', 'toothbrush' ] # class names
After using a tool like Roboflow Annotate to label your images, export your labels to YOLO format, with one *.txt file per image (if no objects in image, no *.txt file is required). The *.txt file specifications are:
Each row is in class x_center y_center width height format.
Box coordinates must be normalized. If your boxes are in pixels, divide x_center and width by image width, and y_center and height by image height.
The label file corresponding to the above image contains 2 persons (class 0) and a tie (class 27).
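The normalization rule above can be sketched as a small conversion function. The function name `to_yolo_row` and the choice of pixel corner coordinates (x_min, y_min, x_max, y_max) as input are assumptions for illustration; your annotation tool may export a different pixel format.

```python
def to_yolo_row(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space corner box into one YOLO label row."""
    # Center and size are divided by the image dimensions so they fall in [0, 1].
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 200x200 pixel box with top-left corner (100, 200) in a 640x480 image:
print(to_yolo_row(0, 100, 200, 300, 400, 640, 480))
# prints "0 0.312500 0.625000 0.312500 0.416667"
```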
Organize your train and val images and labels according to the example below. YOLOv5 assumes /coco128 is inside a /datasets directory next to the /yolov5 directory. YOLOv5 locates labels automatically for each image by replacing the last instance of /images/ in each image path with /labels/. For example:
../datasets/coco128/images/im0.jpg # image
../datasets/coco128/labels/im0.txt # label
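To make the lookup rule concrete, here is a small sketch of the substitution as described above. It illustrates the rule only and is not YOLOv5's own implementation; the function name `image_to_label_path` is hypothetical, and forward-slash paths are assumed.

```python
import posixpath


def image_to_label_path(image_path: str) -> str:
    """Replace the last '/images/' with '/labels/' and swap the extension for .txt."""
    head, found, tail = image_path.rpartition("/images/")
    if not found:
        raise ValueError(f"no '/images/' component in {image_path!r}")
    stem, _ext = posixpath.splitext(tail)
    return f"{head}/labels/{stem}.txt"


print(image_to_label_path("../datasets/coco128/images/im0.jpg"))
# prints "../datasets/coco128/labels/im0.txt"
```

Under this rule the label directory name follows entirely from the image path, so a sibling folder with a different name is never consulted.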
Good luck 🍀 and let us know if you have any other questions!
@glenn-jocher, thanks for pointing it out again. I have read this document and used it successfully with several custom datasets, but I am afraid it doesn't answer my specific question. The document covers a single dataset, while I am asking about N variants of one dataset that share the same dataset root dir without duplicating data.
In the above example,
../datasets/coco128/images/im0.jpg # image
../datasets/coco128/labels/im0.txt # label
The folder name labels seems to be fixed by default. This document does not specify how to change it if I have labels_v2 or labels_v3 in the same folder:
../datasets/coco128/images/im0.jpg # image
../datasets/coco128/labels/im0.txt # label
../datasets/coco128/labels_v2/im0.txt # label_v2
../datasets/coco128/labels_v3/im0.txt # label_v3
Thanks!
👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!
Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!
Hello, bryanbocao! Did you find a way to solve your problem? I have the same problem now. I would be grateful if you could share your experience.
I have the same problem: I want to specify the path of the labels directory.
However, judging from the source code, this feature is not currently supported, because the labels directory is derived automatically from the xxx/images/xxx images directory, which matches what the official documentation says.
@chobits hello! Thank you for bringing that to our attention. The labels directory's current auto-generation from the images directory is indeed in line with the current behavior. While specifying a separate path for labels isn't currently supported, your feedback has been duly noted and will be taken into account for future improvements.
Feel free to keep an eye on the release notes and documentation updates for any future changes. We appreciate your understanding and patience!
Any updates on this? It seems like a small fix that should have been made by now, since this is one of the most basic input features.
I fixed it by modifying my local branch at the time. It has been a long while, so I no longer recall the details. I haven't checked whether there has been an official update, but still, thank you for your work.
Hello! Thanks for checking back on this. As of now, there hasn't been an official update to support specifying separate paths for the labels directory directly through the configuration. We understand the importance of this feature and appreciate your input, which helps in enhancing the functionality of YOLOv5.
If there are updates regarding this feature, they'll be included in the release notes and documentation. Thanks once again for your patience and for being a part of the YOLOv5 community! 🌟
Ok, I understood.
Cool! Looking forward to seeing the new features.
Hello! We're glad to hear your enthusiasm and appreciate your support! We'll definitely keep the community updated on any new features and enhancements. If you have any more questions or need further assistance in the meantime, don't hesitate to ask. Happy coding! 😊🚀
Question
Hello! I like the way this repo is organized! I am trying to do a sort of "grid search" to investigate the performance of YOLO. Specifically, I have a base coco dataset, exactly the one downloaded by the script in coco.yaml, and would like to vary it on two levels: (1) at the image level, I apply some image processing and have different sets of images, say images_v2, images_v3, images_v4, while images is the base one; (2) at the label level for bboxes, I also have variations such as changed label names, number of classes, or category ids, saved in separate label folders: labels_v2, labels_v3, labels_v4.
Below is a brief structure of files in dataset/coco:
By "Grid Search" I mean I will have one result for each pair of images* and labels*, resulting in 4 (images) x 4 (labels) = 16 sets of experiments in total.
Q1: Is there any way to do that efficiently?
A straightforward way is to have 16 copies of the coco dataset, like coco_1, coco_2, coco_3, each corresponding to one pair of images* and labels*. However, that requires 16 x 20.1GB = 321.6GB of space, which is too much for me.
When sweeping images*, it seems that I can just change the image paths in train2017.txt and val2017.txt, but the default label path is labels and I don't see how I can specify that path in https://github.com/ultralytics/yolov5/blob/master/data/coco.yaml. Q2: Is there any way to do that?
Appreciate your help!
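One way to cover all 16 pairs without copying data is to build a small "view" directory per pair, containing only two symlinks named images and labels. The sketch below is a hedged illustration: the function name `build_views`, the `views_dir` layout, and the `<images>__<labels>` naming are hypothetical. Whether YOLOv5 keeps the symlinked path (rather than resolving it to the real directory) before deriving label paths is an assumption that should be verified on your version.

```python
import itertools
from pathlib import Path


def build_views(root: Path, views_dir: Path, image_variants, label_variants):
    """Create one lightweight view directory per (images, labels) pair.

    Each view holds two symlinks, `images` and `labels`, pointing at the
    chosen variant folders under `root`, so no image or label file is copied.
    """
    views = []
    for img, lbl in itertools.product(image_variants, label_variants):
        view = views_dir / f"{img}__{lbl}"
        view.mkdir(parents=True, exist_ok=True)
        for link_name, target in (("images", img), ("labels", lbl)):
            link = view / link_name
            if link.is_symlink():
                link.unlink()  # refresh a link left over from an earlier run
            link.symlink_to((root / target).resolve(), target_is_directory=True)
        views.append(view)
    return views
```

Each experiment would then use a data.yaml whose path entry points at one view, for example views/images_v2__labels_v3, with train and val given as image directories relative to it. Creating symlinks on Windows may require elevated privileges.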