ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Four channel image training #2611

Closed Sophia-11 closed 3 years ago

Sophia-11 commented 3 years ago

❔Question

Hello, may I ask a question about YOLOv5? If I have two pictures and I want to combine them into 4 channels for training, what should I do? Thank you for your help.

github-actions[bot] commented 3 years ago

👋 Hello @Sophia-11, thank you for your interest in 🚀 YOLOv5! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we cannot help you.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.

For business inquiries or professional support requests please visit https://www.ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.

Requirements

Python 3.8 or later with all requirements.txt dependencies installed, including torch>=1.7. To install run:

$ pip install -r requirements.txt

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

CI CPU testing

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), testing (test.py), inference (detect.py) and export (export.py) on macOS, Windows, and Ubuntu every 24 hours and on every commit.

glenn-jocher commented 3 years ago

@Sophia-11 I don't understand how 2 images can be combined with 4 channels. Can you supply details or point to documentation/sources for this use case?

Sophia-11 commented 3 years ago

@Sophia-11 I don't understand how 2 images can be combined with 4 channels. Can you supply details or point to documentation/sources for this use case?

Sorry, I didn't describe it clearly. I have two data sources: the first has 3 RGB channels, and the second is a single channel. The two have been registered and share the same bounding boxes. How can I train on these 4 channels at the same time? Thank you! @glenn-jocher

glenn-jocher commented 3 years ago

@Sophia-11 YOLOv5 models can be created with any input channel count: https://github.com/ultralytics/yolov5/blob/005d7a8c54a39d89bf2b9dc03fba82a489cd0628/models/yolo.py#L66-L67

The dataloader is designed for 3 channel images however, and includes augmentations for 3 channel images such as HSV augmentation, so you'd need to customize the dataloader to suit your requirements if you wanted to train 4 channel images: https://github.com/ultralytics/yolov5/blob/005d7a8c54a39d89bf2b9dc03fba82a489cd0628/utils/datasets.py#L341
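A minimal sketch of what this looks like in practice (illustrative only, not repository code; rgb.jpg and ir.png are placeholder names for a registered RGB/single-channel pair, and the script is assumed to run from the YOLOv5 repo root):

import cv2
import numpy as np
import torch

from models.yolo import Model  # YOLOv5 module; run from the repo root

# Build the model with 4 input channels via the ch argument shown in the link above
model = Model('models/yolov5s.yaml', ch=4)

rgb = cv2.imread('rgb.jpg')                          # HxWx3 (BGR)
extra = cv2.imread('ir.png', cv2.IMREAD_GRAYSCALE)   # HxW, single channel
img = np.dstack([rgb, extra])                        # HxWx4, channels stacked
img = cv2.resize(img, (640, 640))                    # keep H and W divisible by the model stride

x = torch.from_numpy(img).permute(2, 0, 1).float().unsqueeze(0) / 255.0  # 1x4x640x640
pred = model(x)  # the forward pass runs; the stock dataloader still assumes 3 channels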

Sophia-11 commented 3 years ago

@Sophia-11 YOLOv5 models can be created with any input channel count: https://github.com/ultralytics/yolov5/blob/005d7a8c54a39d89bf2b9dc03fba82a489cd0628/models/yolo.py#L66-L67

The dataloader is designed for 3 channel images however, and includes augmentations for 3 channel images such as HSV augmentation, so you'd need to customize the dataloader to suit your requirements if you wanted to train 4 channel images: https://github.com/ultralytics/yolov5/blob/005d7a8c54a39d89bf2b9dc03fba82a489cd0628/utils/datasets.py#L341

Thanks a lot! I'll have a try.

github-actions[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

ingin97 commented 2 years ago

Who were you @Sophia-11? What did you see?! [image]

(I have the same use case and problem :) )

wangfurong123 commented 2 years ago

@Sophia-11 Hello, I am also learning to use YOLOv5 to train four-channel images. How did you change the data loader? Can you share the method? Thanks!

glenn-jocher commented 2 years ago

@Sophia-11 @ingin97 @wangfurong123 this is on our TODO list; tentatively expect this to be completed in the next few months.

I think the way we'll have to do this is to add a ch: field to the model.yaml, and then pass this to the dataloader.

WARNING: the dataloader will skip all images that are not ch compliant, and HSV augmentation will be disabled if ch != 3.
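A hedged sketch of that last guard (augment_if_rgb is a hypothetical helper; augment_hsv is YOLOv5's existing utility in utils/datasets.py at the commit linked earlier in the thread):

from utils.datasets import augment_hsv  # YOLOv5 HSV augmentation utility

def augment_if_rgb(img, hgain=0.015, sgain=0.7, vgain=0.4):
    """Apply HSV jitter only to HxWx3 images; skip any other channel count."""
    if img.ndim == 3 and img.shape[2] == 3:
        augment_hsv(img, hgain=hgain, sgain=sgain, vgain=vgain)  # modifies img in place
    return img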

ingin97 commented 2 years ago

We managed to achieve 4 channel training by:

But this is probably not the best way to do it generically.

wangfurong123 commented 2 years ago

@ingin97 Hi, if it is convenient, could you share the code for training on 4-channel data? It would help me a lot. I look forward to your sharing, thank you.

ingin97 commented 2 years ago

@wangfurong123 We did not initially think of sharing it when we developed it, as it is just a proof of concept, so some parts are not as generic as the yolov5 repo.

In the four-input branch you can find the model FourInputModel. If you place your secondary images in a folder called 'ir' beside 'images' and 'labels', it should be able to recognize them, and you can use the train.py script from the command line as usual.

There is also a generic-fusion branch that instead runs two parallel backbones before fusing the results. We are hoping this gives slightly better results than just concatenating the two images, but we have not gotten our first results yet.
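A hedged sketch of that folder convention (this is not the actual four-input branch code; it assumes the 'ir' image shares the RGB image's filename and extension):

import os

import cv2
import numpy as np

def load_rgb_ir(img_path):
    """Load .../images/xxx.jpg plus its registered pair from a sibling .../ir/ folder as HxWx4."""
    rgb = cv2.imread(img_path)  # HxWx3
    ir_path = img_path.replace(f'{os.sep}images{os.sep}', f'{os.sep}ir{os.sep}')
    ir = cv2.imread(ir_path, cv2.IMREAD_GRAYSCALE)  # HxW
    assert rgb is not None and ir is not None, f'missing RGB/IR pair for {img_path}'
    return np.dstack([rgb, ir])  # HxWx4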

wangfurong123 commented 2 years ago

@ingin97 OK. Thank you for your reply.

dreamer-1996 commented 2 years ago

@glenn-jocher Has this TODO been completed?

glenn-jocher commented 2 years ago

No. If it were completed, the TODO would be removed and a merged PR would be linked to this issue.

marziyemahmoudifar commented 2 years ago

ingin97

@wangfurong123 We did not initially think of sharing it when we developed it, as it is just a proof of concept, so some parts are not as generic as the yolov5 repo.

In the four-input branch you can find the model FourInputModel. If you place your secondary images in a folder called 'ir' beside 'images' and 'labels', it should be able to recognize them, and you can use the train.py script from the command line as usual.

There is also a generic-fusion branch that instead runs two parallel backbones before fusing the results. We are hoping this gives slightly better results than just concatenating the two images, but we have not gotten our first results yet.

Hi, I want to train YOLOv5 with 4-channel input. Can I use the repository you mentioned at this address: https://github.com/sebastianvitterso/master-sau/blob/four-input/yolov5/models/yolo_four_input.py ? Thank you.

urbansound8K commented 2 years ago

Has anyone actually managed to make this work?

How do you train with four channels?

lihlong commented 1 year ago

How? It's already 2023!

willwang-cv commented 1 year ago

How? It's already 2023!

ajithvcoder commented 1 year ago

Has anyone actually managed to make this work?

How do you train with four channels?

@lihlong @willwang-ai I tested it on YOLOv8; it works if you add the line ch: 4

in yolov8s.yaml or any model .yaml file (yolovxx.yaml), and you will be able to train it. If anyone has a hint on how to add data augmentations, please let me know.
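A minimal sketch of that workflow (file names are hypothetical: yolov8s-4ch.yaml is a copy of yolov8s.yaml with a ch: 4 line added, and my_dataset.yaml is your dataset config):

from ultralytics import YOLO

model = YOLO('yolov8s-4ch.yaml')  # model yaml copy with `ch: 4` added at the top
model.train(data='my_dataset.yaml', epochs=100, imgsz=640)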

glenn-jocher commented 1 year ago

@ajithvallabai it's great to hear that you were able to successfully train YOLOv8 with four channels by adding the ch: 4 line in the configuration file. For data augmentations, you can refer to the data.yaml file in the YOLOv5 repository where you can specify various augmentation techniques like random cropping, flipping, scaling, etc. In addition, you can also consider augmentations specifically for four-channel images, such as adjusting the intensity of the infrared channel or augmenting the RGB and infrared channels separately.
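As an illustration of that last suggestion, a hedged sketch of channel-wise jitter on an HxWx4 RGB+IR array (this is not an existing YOLOv5/YOLOv8 option; the function name and gain ranges are illustrative):

import numpy as np

def jitter_rgb_ir(img, rng=np.random):
    """Apply independent brightness gains to the RGB channels and the IR channel of an HxWx4 image."""
    img = img.astype(np.float32)
    img[..., :3] *= rng.uniform(0.7, 1.3)  # one gain shared by the RGB channels
    img[..., 3] *= rng.uniform(0.8, 1.2)   # separate gain for the infrared channel
    return np.clip(img, 0, 255).astype(np.uint8)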

ajithvcoder commented 1 year ago

@glenn-jocher Thanks for your response. Could you provide an example data.yaml file? There is no data.yaml file in the YOLOv5 repo.

I tried the config below in the YOLOv8 repo, but it doesn't seem to work; only the fields I change in default.yaml, like shear, perspective, etc., have any effect.

train: /path/to/train/images
val: /path/to/val/images

nc: 80
names: ['class1', 'class2', 'class3', ...]

# Data augmentation
train_transforms:
  - RandomFlip()
  - RandomRotate(degrees=10)
  - RandomContrast(0.5, 1.5)
  - RandomBrightness(0.5, 1.5)
  - RandomSaturation(0.5, 1.5)
  - Resize((640, 640), pad_mode='letterbox')
  - ToTensor()

# Validation transforms
val_transforms:
  - Resize((640, 640), pad_mode='letterbox')
  - ToTensor()

malore350 commented 1 year ago

Has anyone actually managed to make this work? How do you train with four channels?

@lihlong @willwang-ai I tested it on YOLOv8; it works if you add the line ch: 4

in yolov8s.yaml or any model .yaml file (yolovxx.yaml), and you will be able to train it. If anyone has a hint on how to add data augmentations, please let me know.

When I place ch: 4 in the yolov8.yaml file, the training does run and finishes successfully, but once I validate the model (using model.val()), it gives this error:

Given groups=1, weight of size [80, 4, 3, 3], expected input[1, 3, 800, 800] to have 4 channels, but got 3 channels instead

Did you have such an issue?

glenn-jocher commented 1 year ago

@malore350 the error you encountered suggests that there is a mismatch between the number of channels in your model's weights and the input image. Since you added ch: 4 to the config file, the model expects the input image to have 4 channels instead of the default 3 channels.

To resolve this issue, you need to ensure that your dataset provides images with 4 channels. If your dataset only contains RGB images with 3 channels, you won't be able to train a model with 4 channels using the YOLOv5 repository without modifying the code accordingly.

In order to work with 4-channel images, you may need to make changes to the model architecture, specifically in the model constructor and the forward pass. Additionally, you would also need to modify the data pipeline to handle 4-channel images appropriately.

Keep in mind that modifying the codebase could be a complex task and may entail making changes to multiple files, so a deep understanding of the codebase would be necessary.
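One quick way to confirm the mismatch before calling model.val() is to inspect the first convolution of the saved weights (a hedged sketch; the checkpoint path is hypothetical, and Ultralytics checkpoints store the model under the 'model' key):

import torch
from torch import nn

ckpt = torch.load('runs/detect/train/weights/best.pt', map_location='cpu')
first_conv = next(m for m in ckpt['model'].modules() if isinstance(m, nn.Conv2d))
print('weights expect', first_conv.in_channels, 'input channels')  # 4 for a ch: 4 model

If this prints 4, the validation pipeline must also feed 4-channel images; the stock validation loader reads 3-channel images from disk, which is exactly the mismatch reported in the error above.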

Alicia-hou commented 9 months ago

@Sophia-11 YOLOv5 models can be created with any input channel count: https://github.com/ultralytics/yolov5/blob/005d7a8c54a39d89bf2b9dc03fba82a489cd0628/models/yolo.py#L66-L67

The dataloader is designed for 3 channel images however, and includes augmentations for 3 channel images such as HSV augmentation, so you'd need to customize the dataloader to suit your requirements if you wanted to train 4 channel images: https://github.com/ultralytics/yolov5/blob/005d7a8c54a39d89bf2b9dc03fba82a489cd0628/utils/datasets.py#L341

Thanks a lot! I'll have a try.

Hi, have you successfully trained YOLOv5 with four channels? I have the same use case and problem. I would appreciate it if you could share your findings on this question.

glenn-jocher commented 3 weeks ago

To handle 30-channel images, you'll need to modify the dataloader and validation scripts to accommodate the additional channels. Specifically, check the datasets.py and plots.py files to ensure they process and visualize 30-channel images correctly. Adjust any operations that assume 3 channels to work with your specific channel count.
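For the plotting side, a hedged sketch of the kind of adjustment described (the helper is hypothetical): keep only the first three channels so existing RGB visualization code continues to work.

import numpy as np

def to_displayable(img):
    """Reduce an HxWxC array with C > 3 to HxWx3 for plotting; pass other images through unchanged."""
    if img.ndim == 3 and img.shape[2] > 3:
        return img[..., :3]
    return img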