Closed bach05 closed 2 years ago
👋 Hello @bach05, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.
If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.
If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.
For business inquiries or professional support requests please visit https://ultralytics.com or email support@ultralytics.com.
Python>=3.7.0 with all requirements.txt installed including PyTorch>=1.7. To get started:
git clone https://github.com/ultralytics/yolov5 # clone
cd yolov5
pip install -r requirements.txt # install
@bach05 python train.py --cache disk will cache any image format to *.npy files for faster reads during training.
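As a rough illustration of what disk caching amounts to, the sketch below writes a decoded image array to a .npy file and reads it back unchanged. It uses only NumPy; the random array stands in for a decoded image, and the helper name cache_to_npy is hypothetical, not a YOLOv5 function.

```python
import tempfile
from pathlib import Path

import numpy as np


def cache_to_npy(image: np.ndarray, image_path: Path) -> Path:
    """Save a decoded image beside its source path with a .npy suffix (hypothetical helper)."""
    npy_path = image_path.with_suffix(".npy")
    np.save(npy_path, image)
    return npy_path


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for an image decoded from im0.jpg (assumption: HxWxC uint8).
    fake_image = np.random.randint(0, 256, size=(64, 48, 3), dtype=np.uint8)
    npy_file = cache_to_npy(fake_image, Path(tmp) / "im0.jpg")
    restored = np.load(npy_file)
    round_trip_ok = np.array_equal(fake_image, restored)
```

The point of the round trip is that np.load returns the array exactly as stored, with no decoding step.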
@glenn-jocher Yes, that's OK. But I have a dataset of multichannel images saved in .npy format. Is it possible to train on this dataset?
If I launch python train.py, it gives me the error 'No images or videos found in {p}. Supported formats are:\nimages: {IMG_FORMATS}\nvideos: {VID_FORMATS}'.
I tried adding the .npy extension to IMG_FORMATS, but then I get another error: '{prefix}No labels in {cache_path}. Can not train without labels. See {HELP_URL}'.
@bach05 👋 Hello! Thanks for asking about YOLOv5 🚀 dataset formatting. Yes, you'll need labels to train. Full guide below.
To train correctly your data must be in YOLOv5 format. Please see our Train Custom Data tutorial for full documentation on dataset setup and all steps required to start training your first model. A few excerpts from the tutorial:
COCO128 is an example small tutorial dataset composed of the first 128 images in COCO train2017. These same 128 images are used for both training and validation to verify our training pipeline is capable of overfitting. data/coco128.yaml, shown below, is the dataset config file that defines 1) the dataset root directory path and relative paths to train / val / test image directories (or *.txt files with image paths), 2) the number of classes nc and 3) a list of class names:
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/coco128 # dataset root dir
train: images/train2017 # train images (relative to 'path') 128 images
val: images/train2017 # val images (relative to 'path') 128 images
test: # test images (optional)
# Classes
nc: 80 # number of classes
names: [ 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
'hair drier', 'toothbrush' ] # class names
After using a tool like Roboflow Annotate to label your images, export your labels to YOLO format, with one *.txt file per image (if no objects in image, no *.txt file is required). The *.txt file specifications are:
- One row per object.
- Each row is in class x_center y_center width height format.
- Box coordinates must be normalized to the range 0 to 1. If your boxes are in pixels, divide x_center and width by image width, and y_center and height by image height.
The label file corresponding to the above image contains 2 persons (class 0) and a tie (class 27).
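To make the normalization rule concrete, here is a minimal sketch that converts one pixel-space box into a label row. The function name to_yolo_row and the sample box are illustrative assumptions; only the row layout (class x_center y_center width height, divided by image width and height) comes from the text above.

```python
def to_yolo_row(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space corner box to one normalized YOLO label row (hypothetical helper)."""
    x_center = (x_min + x_max) / 2 / img_w   # divide x values by image width
    y_center = (y_min + y_max) / 2 / img_h   # divide y values by image height
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A class-0 box spanning (100, 50) to (300, 250) in a 640x480 image.
row = to_yolo_row(0, 100, 50, 300, 250, img_w=640, img_h=480)
print(row)  # 0 0.312500 0.312500 0.312500 0.416667
```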
Organize your train and val images and labels according to the example below. YOLOv5 assumes /coco128 is inside a /datasets directory next to the /yolov5 directory. YOLOv5 locates labels automatically for each image by replacing the last instance of /images/ in each image path with /labels/. For example:
../datasets/coco128/images/im0.jpg # image
../datasets/coco128/labels/im0.txt # label
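The path rule can be sketched in a few lines. This is an illustration of the rule as described, not a copy of the YOLOv5 source; the name image_to_label_path is hypothetical, and forward slashes are assumed as the separator.

```python
def image_to_label_path(image_path: str) -> str:
    """Swap the last '/images/' for '/labels/' and the file extension for '.txt'."""
    head, tail = image_path.rsplit("/images/", 1)   # split on the LAST occurrence only
    label_path = f"{head}/labels/{tail}"
    return label_path.rsplit(".", 1)[0] + ".txt"


jpg_label = image_to_label_path("../datasets/coco128/images/im0.jpg")
npy_label = image_to_label_path("../datasets/coco128/images/im0.npy")
print(jpg_label)  # ../datasets/coco128/labels/im0.txt
```

Because the rule depends only on the directory name and the file stem, a .npy image would map to the same label file as its .jpg counterpart.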
Good luck 🍀 and let us know if you have any other questions!
@glenn-jocher Thanks. I have actually followed your guidelines and I have my labels. In fact, the training works with .jpg files. But I would like to train on .npy files.
The problem arises when I save the images as .npy files. The train.py script does not seem to be able to handle them. Here is a minimal example of a dataset to reproduce the problem.
I would like to launch training directly on .npy files (each file contains a WxHxC matrix, which is a multichannel image). Is it possible? Where do I need to modify the code?
Hey! I would also like to train on ".npy" files containing an array of numbers instead of images. My dataset is a 1-channel float64 array (WxHx1). The data structure is close to @bach05's example.
Hey! I am also trying to do the same.
My dataset images have 6 channels and I can not save them in any image format. Therefore, I have saved them in .npy format. However, no method is provided to train on such files. If someone can give a hint on how or where to modify the code to make it work it would be awesome.
Hi @MattiaMolon, the only workaround I found is saving the multichannel files in TIFF format, which supports "multi-page" images. In the same way, @cyber-pizdec can save his mono-channel images in JPG or PNG format.
I can't save my uint32 array to PNG; PIL saves it as uint16. When reading a uint32 TIFF, I get this error: opencv-python\opencv-python\opencv\modules\imgcodecs\src\grfmt_tiff.cpp (462) cv::TiffDecoder::readData OpenCV TIFF: TIFFRGBAImageOK: Sorry, can not handle images with 32-bit samples. Please advise how to get around this limitation.
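One possible way around the codec limits, assuming the data does not need to pass through an image format at all, is to keep it in .npy and rescale it to float32 before it reaches the network. The sketch below only shows that NumPy preserves uint32 on a save/load round trip and how the values could be scaled to the range 0 to 1; the sample values are made up.

```python
import io

import numpy as np

# Made-up uint32 samples, including a value far above the uint16 range.
raw = np.array([[0, 70_000], [4_000_000_000, 123]], dtype=np.uint32)

# Round trip through the .npy format (an in-memory buffer stands in for a file).
buffer = io.BytesIO()
np.save(buffer, raw)
buffer.seek(0)
restored = np.load(buffer)

# Scale to [0, 1]: divide in float64 first to limit rounding, then cast to float32.
scaled = (restored.astype(np.float64) / np.iinfo(np.uint32).max).astype(np.float32)
```

Whether dividing by the full uint32 range is the right normalization depends on the sensor; a per-dataset maximum may suit the data better.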
Hi @bach05, thank you very much for the suggestion. I am now able to start the training loop. However, I now have another problem: the number of channels seems to be hardcoded to 3 in the train.py file and in the Dataloader. Even if I load a custom model with:
model = torch.hub.load(
"ultralytics/yolov5", model_name, autoshape=False, classes=n_classes, channels=n_channels,
)
The Dataloader seems to drop the 3 additional channels automatically. Therefore I get the following error while forwarding an image into the model:
File "my_train.py", line 670, in <module>
main(opt)
File "my_train.py", line 565, in main
train(opt.hyp, opt, device, callbacks)
File "my_train.py", line 351, in train
pred = model(imgs) # forward
File "yolov5/.venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "yolov5/models/yolo.py", line 135, in forward
return self._forward_once(x, profile, visualize) # single-scale inference, train
File "yolov5/models/yolo.py", line 158, in _forward_once
x = m(x) # run
File "yolov5/.venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "yolov5/models/common.py", line 50, in forward_fuse
return self.act(self.conv(x))
File "yolov5/.venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "yolov5/.venv/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 447, in forward
return self._conv_forward(input, self.weight, self.bias)
File "yolov5/.venv/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 443, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [32, 6, 6, 6], expected input[16, 3, 128, 128] to have 6 channels, but got 3 channels instead
Did you run into the same issue? I have found #2177 suggesting modifying the Dataloader. Was this the solution for you?
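The error above is a shape mismatch: the first convolution expects batches of shape (N, 6, H, W) but receives (N, 3, H, W). As a hedged sketch of what the data side has to deliver, the NumPy-only code below converts HxWxC arrays to CxHxW, stacks them into a batch, and fails early if the channel count is wrong. The names hwc_to_chw and make_batch are hypothetical and are not part of YOLOv5.

```python
import numpy as np


def hwc_to_chw(image: np.ndarray) -> np.ndarray:
    """Reorder an HxWxC array (or HxW, treated as 1 channel) to CxHxW."""
    if image.ndim == 2:
        image = image[:, :, None]
    return np.ascontiguousarray(image.transpose(2, 0, 1))


def make_batch(images, expected_channels: int) -> np.ndarray:
    """Stack images into an (N, C, H, W) float32 batch and check the channel count."""
    batch = np.stack([hwc_to_chw(im) for im in images]).astype(np.float32)
    if batch.shape[1] != expected_channels:
        raise ValueError(
            f"expected {expected_channels} channels, got {batch.shape[1]}"
        )
    return batch


# Sixteen 128x128 images with 6 channels, matching the sizes in the traceback.
six_channel_images = [np.zeros((128, 128, 6), dtype=np.float32) for _ in range(16)]
batch = make_batch(six_channel_images, expected_channels=6)
print(batch.shape)  # (16, 6, 128, 128)
```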
@MattiaMolon I was planning to work on that problem, but I have had to pause because of other duties. I hope to resume working on it soon!
👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!
Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!
Following this as well. I would like to load the data from .npy files. How and where do I need to change the code? Any help, please?
@Monibsediqi to load data from .npy files in YOLOv5, you can make the following modifications:
1. In the data.yaml file, update the train and val paths to point to your .npy files instead of image directories.
2. In the datasets.py file, update the LoadImagesAndLabels class. Replace the image loading logic with code to load .npy files using np.load().
3. Modify the load_image() function in the same file (utils/datasets.py, renamed utils/dataloaders.py in later releases) to handle .npy files appropriately. You can use np.load() to load the .npy file and convert it to the required tensor format.
Note that you might also need to change the input dimensions and the number of channels in the model architecture to match your .npy file format.
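The steps above can be sketched outside YOLOv5 as a small stand-alone dataset class. This is an assumption-laden illustration, not the LoadImagesAndLabels implementation: the class name NpyDetectionDataset, the images/ and labels/ layout under one root, and the (n, 5) label array are all choices made for the example. It relies only on NumPy and the standard library.

```python
import tempfile
from pathlib import Path

import numpy as np


class NpyDetectionDataset:
    """Hypothetical dataset: <root>/images/*.npy with labels in <root>/labels/*.txt."""

    def __init__(self, root):
        self.root = Path(root)
        self.image_files = sorted((self.root / "images").glob("*.npy"))
        if not self.image_files:
            raise FileNotFoundError(f"No .npy images found under {self.root / 'images'}")

    def __len__(self):
        return len(self.image_files)

    def __getitem__(self, index):
        image_file = self.image_files[index]
        image = np.load(image_file)  # keeps every channel and the stored dtype
        label_file = self.root / "labels" / f"{image_file.stem}.txt"
        if label_file.exists() and label_file.stat().st_size > 0:
            # Rows of: class x_center y_center width height
            labels = np.loadtxt(label_file, dtype=np.float32, ndmin=2)
        else:
            labels = np.zeros((0, 5), dtype=np.float32)  # image without objects
        return image, labels


# Build a one-image dataset in a temporary directory and read it back.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "images").mkdir()
    (root / "labels").mkdir()
    np.save(root / "images" / "im0.npy", np.zeros((32, 32, 4), dtype=np.float32))
    (root / "labels" / "im0.txt").write_text("0 0.5 0.5 0.2 0.3\n")
    dataset = NpyDetectionDataset(root)
    image, labels = dataset[0]
```

In the real training pipeline the loaded array would still have to go through resizing, augmentation and batching, all of which assume 3-channel uint8 images in several places.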
Hope this helps! Let me know if you have any further questions.
Thank you so much for your prompt reply. Really appreciate it. I will follow the steps that you've mentioned and get back to you with the outcome.
@Monibsediqi you're welcome! I'm glad I could help. Feel free to reach out if you have any further questions or need any assistance. Good luck with your modifications, and I look forward to hearing about the outcome.
I have a dataset containing pairs of images taken simultaneously by IR and Thermal cameras. I need to modify the model architecture to take these two images as input and detect the object present in them. The model should then provide two images as output in response. How do I make changes?
@VaishviShah hi,
To modify the YOLOv5 model architecture to take pairs of images from IR and Thermal cameras as input, you can follow these steps:
1. Update the data pipeline: Modify the datasets.py file to load both IR and Thermal images as input. You can use the cv2 library or any other image processing library to read both images and concatenate them along the channel axis.
2. Modify the model architecture: Update the network definition to accept the concatenated input. The layer building blocks are in models/common.py, and the model class with its forward method is in models/yolo.py; the first convolution must expect the combined number of channels.
3. Update the training script: In the train.py script, make sure to pass both IR and Thermal images to the model during training and evaluation. You can modify the Dataloader to load the images accordingly.
By following these steps, you should be able to modify the YOLOv5 model architecture to handle pairs of IR and Thermal images as input and provide two images as output. Let me know if you need further help.
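As a minimal sketch of the concatenation in the first step, the code below stacks an IR frame and a thermal frame along the channel axis with NumPy. The function name stack_pair, the frame sizes and the channel counts are assumptions for illustration; real frames would first need to be spatially registered.

```python
import numpy as np


def stack_pair(ir: np.ndarray, thermal: np.ndarray) -> np.ndarray:
    """Concatenate two aligned frames along the channel axis (hypothetical helper)."""
    if ir.shape[:2] != thermal.shape[:2]:
        raise ValueError(f"size mismatch: {ir.shape[:2]} vs {thermal.shape[:2]}")
    # Give single-channel HxW frames an explicit channel axis.
    if ir.ndim == 2:
        ir = ir[:, :, None]
    if thermal.ndim == 2:
        thermal = thermal[:, :, None]
    return np.concatenate([ir, thermal], axis=2)


# Assumed example: a 1-channel IR frame and a 3-channel thermal frame.
ir_frame = np.zeros((480, 640), dtype=np.uint8)
thermal_frame = np.ones((480, 640, 3), dtype=np.uint8)
fused = stack_pair(ir_frame, thermal_frame)
print(fused.shape)  # (480, 640, 4)
```

The fused array's channel count (4 here) is the value the model's first convolution would have to accept.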
Happy coding!
Do you have any method to change YOLO v8 architecture?
@VaishviShah to modify the YOLOv5 model architecture and create YOLOv8, you will need to make significant modifications to the codebase.
Here is a high-level overview of the steps you can follow:
Update the backbone network: You can change the backbone network architecture, such as using a different feature extractor or using a different backbone altogether.
Modify the detection head: Depending on the specific changes you want to make, you may need to modify the detection head to accommodate the new architecture. This includes changing the number of anchor boxes, adjusting the number of output channels, and modifying the prediction layers.
Adjust the architecture configuration: Make sure to update the model configuration file to reflect the changes you made in the backbone network and detection head. This includes updating the number of output channels, anchor sizes, and strides.
Train the model: Once you have modified the architecture, you can train the model using your dataset. Make sure to handle any changes in image preprocessing, data loading, and data augmentation.
Please note that creating a YOLOv8 model involves significant modifications and could require substantial effort. It is recommended to have a good understanding of the YOLOv5 codebase and deep learning principles before attempting such modifications.
I hope this guidance helps you in your modifications. Feel free to reach out if you have any further questions.
Description
I would like to be able to train on .npy files. Each file contains a multi-channel image (4 or more channels). Currently the train.py script does not support this format.
Use case
Multichannel image processing
Additional
Just a suggestion on how to modify the source code is enough.