ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

how to use Background images in training? #2844

Closed yustaub closed 3 years ago

yustaub commented 3 years ago

❔Question

Hello, sir. In Tips for Best Training Results, you recommend about 0-10% background images to help reduce FPs. How do I use background images in training? Do I just add the background images to the training images, or do I also need to add corresponding empty txt label files to the training labels? I would really appreciate your reply!

github-actions[bot] commented 3 years ago

👋 Hello @yustaub, thank you for your interest in 🚀 YOLOv5! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.

For business inquiries or professional support requests please visit https://www.ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.

Requirements

Python 3.8 or later with all requirements.txt dependencies installed, including torch>=1.7. To install run:

$ pip install -r requirements.txt

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

CI CPU testing

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), testing (test.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu every 24 hours and on every commit.

glenn-jocher commented 3 years ago

@yustaub to use background images in training you simply add these images to your dataset. No labels are required for background images.
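
A minimal sketch of the step described above, assuming a hypothetical layout with a folder of object-free images and a `train/images` directory (the function name, paths and suffix list are illustrative, not part of YOLOv5). Writing empty `.txt` label files is optional according to the reply above, so it is off by default:

```python
import shutil
from pathlib import Path

# Assumed set of image extensions; adjust to your dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def add_background_images(src_dir, images_dir, labels_dir=None):
    """Copy object-free images into a training images directory.

    If labels_dir is given, an empty .txt file is also created per image.
    Returns the names of the images that were copied.
    """
    src_dir, images_dir = Path(src_dir), Path(images_dir)
    images_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(src_dir.iterdir()):
        if src.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip anything that is not an image
        dst = images_dir / src.name
        if dst.exists():
            continue  # never overwrite an existing training image
        shutil.copy2(src, dst)
        if labels_dir is not None:
            label_path = Path(labels_dir) / f"{src.stem}.txt"
            label_path.parent.mkdir(parents=True, exist_ok=True)
            label_path.touch()  # empty file: no boxes in this image
        copied.append(dst.name)
    return copied
```

Calling `add_background_images("backgrounds", "dataset/train/images")` would copy the images only; passing a third argument such as `"dataset/train/labels"` also creates the empty label files.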

yustaub commented 3 years ago

ok, thanks for your reply!! I will try it.

glenn-jocher commented 3 years ago

@yustaub no problem. I've updated the tutorial with a note about the background images not needing labels.

github-actions[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

LUOBO123LUOBO123 commented 3 years ago

@glenn-jocher hi, sir. If an image is a background image, its label is set to [[0,0,0,0,0]] in your code. Does that mean the background's class is 0? How should I understand that?

glenn-jocher commented 3 years ago

@LUOBO123LUOBO123 background images do not need any labels, you simply add them to your images directory. See below for our full training recommendations, including notes on background images.

LUOBO123LUOBO123 commented 3 years ago

@LUOBO123LUOBO123 background images do not need any labels, you simply add them to your images directory. See below for our full training recommendations, including notes on background images.

👋 Hello! Thanks for asking about improving training results. Most of the time good results can be obtained with no changes to the models or training settings, provided your dataset is sufficiently large and well labelled. If at first you don't get good results, there are steps you might be able to take to improve, but we always recommend users first train with all default settings before considering any changes. This helps establish a performance baseline and spot areas for improvement.

If you have questions about your training results we recommend you provide the maximum amount of information possible if you expect a helpful response, including results plots (train losses, val losses, P, R, mAP), PR curve, confusion matrix, training mosaics, test results and dataset statistics images such as labels.png. All of these are located in your project/name directory, typically yolov5/runs/train/exp.

We've put together a full guide for users looking to get the best results on their YOLOv5 trainings below.

Dataset

* **Images per class.** ≥1.5k images per class

* **Instances per class.** ≥10k instances (labeled objects) per class total

* **Image variety.** Must be representative of deployed environment. For real-world use cases we recommend images from different times of day, different seasons, different weather, different lighting, different angles, different sources (scraped online, collected locally, different cameras) etc.

* **Label consistency.** All instances of all classes in all images must be labelled. Partial labelling will not work.

* **Label accuracy.** Labels must closely enclose each object. No space should exist between an object and its bounding box. No objects should be missing a label.

* **Background images.** Background images are images with no objects that are added to a dataset to reduce False Positives (FP). We recommend about 0-10% background images to help reduce FPs (COCO has 1000 background images for reference, 1% of the total).
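
The 0-10% recommendation can be turned into a quick calculation. The sketch below is only arithmetic: the function names are illustrative, and the COCO figures are the ones quoted in this thread (about 1000 background images in a train set of about 118k).

```python
import math


def background_fraction(n_backgrounds, n_total):
    """Share of the whole dataset made up of background images."""
    return n_backgrounds / n_total


def backgrounds_needed(n_labelled, target_fraction):
    """Background images to add so they form target_fraction of the total.

    Solves b / (n_labelled + b) = target_fraction for b, rounded up.
    """
    if not 0 <= target_fraction < 1:
        raise ValueError("target_fraction must be in [0, 1)")
    return math.ceil(target_fraction * n_labelled / (1 - target_fraction))


# COCO figures quoted in this thread: roughly 0.85% of the train set.
print(f"{background_fraction(1000, 118_000):.2%}")

# A hypothetical dataset of 5000 labelled images aiming for 5% backgrounds.
print(backgrounds_needed(5000, 0.05))
```

Note that the fraction is taken over the total (labelled plus background), which is why the second function divides by `1 - target_fraction`.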

COCO Analysis

Model Selection

Larger models like YOLOv5x and YOLOv5x6 will produce better results in nearly all cases, but have more parameters, require more CUDA memory to train, and are slower to run. For mobile deployments we recommend YOLOv5s/m, for cloud deployments we recommend YOLOv5l/x. See our README table for a full comparison of all models.

YOLOv5 Models

* **Start from Pretrained weights.** Recommended for small to medium sized datasets (e.g. VOC, VisDrone, GlobalWheat). Pass the name of the model to the `--weights` argument. Models download automatically from the [latest YOLOv5 release](https://github.com/ultralytics/yolov5/releases).

    python train.py --data custom.yaml --weights yolov5s.pt
                                                 yolov5m.pt
                                                 yolov5l.pt
                                                 yolov5x.pt

* **Start from Scratch.** Recommended for large datasets (e.g. COCO, Objects365, OIv6). Pass the model architecture yaml you are interested in, along with an empty `--weights ''` argument:

    python train.py --data custom.yaml --weights '' --cfg yolov5s.yaml
                                                          yolov5m.yaml
                                                          yolov5l.yaml
                                                          yolov5x.yaml

Training Settings

Before modifying anything, first train with default settings to establish a performance baseline. A full list of train.py settings can be found in the train.py argparser.

* **Epochs.** Start with 300 epochs. If this overfits early then you can reduce epochs. If overfitting does not occur after 300 epochs, train longer, e.g. 600 or 1200 epochs.

* **Image size.** COCO trains at native resolution of `--img 640`, though due to the large number of small objects in the dataset it can benefit from training at higher resolutions such as `--img 1280`. If there are many small objects then custom datasets will benefit from training at native or higher resolution. Best inference results are obtained at the same `--img` as the training was run at, i.e. if you train at `--img 1280` you should also test and detect at `--img 1280`.

* **Batch size.** Use the largest `--batch-size` that your hardware allows for. Small batch sizes produce poor batchnorm statistics and should be avoided.

* **Hyperparameters.** Default hyperparameters are in [hyp.scratch.yaml](https://github.com/ultralytics/yolov5/blob/master/data/hyp.scratch.yaml). We recommend you train with default hyperparameters first before thinking of modifying any. In general, increasing augmentation hyperparameters will reduce and delay overfitting, allowing for longer trainings and higher final mAP. Reduction in loss component gain hyperparameters like `hyp['obj']` will help reduce overfitting in those specific loss components. For an automated method of optimizing these hyperparameters, see our [Hyperparameter Evolution Tutorial](https://docs.ultralytics.com/yolov5/tutorials/hyperparameter_evolution).

Further Reading

If you'd like to know more a good place to start is Karpathy's 'Recipe for Training Neural Networks', which has great ideas for training that apply broadly across all ML domains: http://karpathy.github.io/2019/04/25/recipe/

Thanks for your reply.

q1135718080 commented 3 years ago

Hello, sir. In Tips for Best Training Results, you recommend about 0-10% background images to help reduce FPs. Will too many background images affect the training result? For example, 20%?

glenn-jocher commented 3 years ago

@q1135718080 obviously as your background image fraction trends towards 1.0 your training will suffer. It's up to you to determine the best mix for your custom dataset.

tino926 commented 3 years ago

@q1135718080 I trained a model on a subset of COCO. In my experience, more background images improved the detection results on my own data.

loucif01 commented 2 years ago

@glenn-jocher thank you for explaining, but I want to know how to do this. If we just add background images to the dataset without labels, as you said, won't training raise errors about the shapes of the input images and their targets, since some images (the backgrounds) have no targets?

glenn-jocher commented 2 years ago

@loucif01 see https://github.com/ultralytics/yolov5/issues/2844#issuecomment-851338384

HJoonKwon commented 2 years ago

@glenn-jocher Then background images can't be in the validation set? (Should they be only in the training set?) I just did it and got zero precision and recall. Thank you in advance.

glenn-jocher commented 2 years ago

@HJoonKwon you can also include them in your validation set.

HJoonKwon commented 2 years ago

@glenn-jocher Thank you for your reply. I should debug it.

glenn-jocher commented 2 years ago

@HJoonKwon no problem. COCO, for example, has about 1000 background images out of 118k in the train set and 48 out of 5000 in the val set.

YoungjaeDev commented 1 year ago

@glenn-jocher

  1. Is it correct that an image with no label file and an image with an empty label file are both treated as the same background?
  2. My log says there are 2232 background images:

    15325 images, 2232 backgrounds, 0 corrupt: 100%|██████████| 15325/15325 [00:00<?, ?it/s]

    and the training time ...

    0/49 22.7G 0.08518 0.03828 0 169 1280: 100%|██████████| 479/479 [24:59<00:00, 3.13s/it]

479 * 32 (batch size) = 15328. Are the background images included in the batches or not? Can you tell me where in the code they are trained?
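
The numbers in this log can be cross-checked with a few lines of arithmetic. The sketch assumes the progress bar counts one iteration per batch of 32 and that the final, partial batch is kept; both are assumptions, not something stated in the log.

```python
import math


def iterations_per_epoch(n_images, batch_size):
    """Batches per epoch when the last, partial batch is kept."""
    return math.ceil(n_images / batch_size)


# Figures taken from the log quoted above.
n_images, n_backgrounds, batch_size = 15325, 2232, 32

with_backgrounds = iterations_per_epoch(n_images, batch_size)
without_backgrounds = iterations_per_epoch(n_images - n_backgrounds, batch_size)

print(with_backgrounds)     # 479, matches the 479/479 in the progress bar
print(without_backgrounds)  # 410
```

Under those assumptions, 479 iterations is what all 15325 images would produce, backgrounds included; the 13093 labelled images alone would give 410.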

glenn-jocher commented 1 year ago

@youngjae-avikus

  1. Correct, an image without a label and an image with an empty label are both considered background images.

  2. There are different ways to train the model and the exact code differs depending on factors such as batch size, number of GPUs, etc. In general, though, background images are loaded into batches like any other image; they simply contribute no positive samples (no box or class targets). The model learns from them through the objectness part of the loss, which pushes its predictions on those images towards "no object".

YoungjaeDev commented 1 year ago

@glenn-jocher thank you for your quick reply Finally, can you give me the line of code for the part you mentioned in answer?

glenn-jocher commented 1 year ago

@youngjae-avikus Sure! In the YOLOv5 training script, background images contribute no box or class targets because they have empty labels, i.e., no bounding boxes are defined in their label files.

Here is an example of how to define the format of the label file in YOLOv5:

class_dict = {}
for i, cls in enumerate(class_list):
    class_dict[cls] = i

...
yield f"{img_file_path}", np.array(label_file), num_boxes

In this example, label_file is a NumPy array of shape (num_boxes, 5), where each row corresponds to a bounding box and contains the label index (0-based), and the box coordinates (x_center, y_center, width, height). If an image has no label boxes, then the label_file should be an empty array:

label_file = np.zeros((0, 5), dtype=np.float32)

This ensures that background images add no box or class targets in the training loop, while the model still learns to predict no objects on them.
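
As an illustration of the label layout described above (one row per box: class index, x_center, y_center, width, height), here is a small parser sketch. It is not YOLOv5's own loader; the function name is made up for this example, and it returns a plain list so that an empty label file naturally yields zero rows.

```python
def parse_label_text(text):
    """Parse the text of one label file into (class, xc, yc, w, h) rows.

    An empty or whitespace-only file returns an empty list, which is the
    background-image case discussed in this thread.
    """
    rows = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        parts = line.split()
        if not parts:
            continue  # tolerate blank lines
        if len(parts) != 5:
            raise ValueError(f"line {line_number}: expected 5 values, got {len(parts)}")
        class_index = int(parts[0])
        xc, yc, w, h = (float(value) for value in parts[1:])
        rows.append((class_index, xc, yc, w, h))
    return rows


print(parse_label_text("0 0.5 0.5 0.2 0.1\n"))
print(parse_label_text(""))  # background image: no rows
```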

ranamohamed12 commented 1 year ago

If I add background images to the training set, does the background always appear in the confusion matrix? I want to remove this class from the confusion matrix.

glenn-jocher commented 1 year ago

@ranamohamed12 it's important to note that background images should not have bounding boxes labeled. The model should learn to recognize the absence of objects as "background" during inference. If you include background images with labeled bounding boxes, then the model will learn to predict these as a separate class, leading to confusion in the classification. Therefore, it's recommended to use unlabeled background images for training. If you already have trained the model with background images with labeled bounding boxes, you may want to retrain the model without the background images and see if it improves the confusion matrix.

ranamohamed12 commented 1 year ago

I'm facing a problem in the training phase. My dataset consists of wide-view road images that have labels only for the specific objects to be detected. In the training results, the confusion matrix shows a background class that is not a class in my dataset. How can I solve it?

glenn-jocher commented 1 year ago

@ranamohamed12 hello,

The confusion matrix showing a background class may be caused by including images in the training set that do not contain any objects to be detected but have been incorrectly labeled. In YOLOv5, any images without labeled bounding boxes will be considered background, so including these unlabeled images in the training set could lead to the model learning to recognize them as an additional class.

To address this issue, you should check your dataset to ensure that all images without labeled bounding boxes are excluded from the training set. Additionally, if your dataset includes images with incorrect labeling, you should correct these labels or exclude these images from the training set altogether.

I hope this helps! If you have any further questions or concerns, please feel free to ask.

ranamohamed12 commented 1 year ago

I will do this …. Thank you for your reply

ranamohamed12 commented 1 year ago

I excluded all images without labelled bounding boxes and then ran the training. The confusion matrix wasn't found (I trained for only 3 epochs, just to check whether the background class was removed or not).

glenn-jocher commented 1 year ago

@ranamohamed12 thank you for your update! It's great to hear that you were able to remove the background class from the confusion matrix by excluding the images without labeled bounding boxes. Checking the confusion matrix after a few epochs of training is a good way to confirm that the change has been effective. I hope that your future training of the dataset goes smoothly and produces accurate results. If you have any further questions or concerns, please don't hesitate to ask.

YoungjaeDev commented 1 year ago

@glenn-jocher

I have a question. For example, let's assume that I extracted 100 images from a video. The scenes in all 100 will be similar because they come from the same video. Suppose 25 of the images contain no object and 75 do. Using those 25 as background images does not seem to have a good effect. What do you think? It was only a small experiment, but when I add object-free backgrounds from a similar scene, performance gets worse on similar scenes that do contain objects.

glenn-jocher commented 1 year ago

@youngjae-avikus hello,

In your scenario, if you label 25 images without objects as background images, you may encounter issues during training, as the model may learn to identify these specific scenes as background and may fail to generalize to similar scenes with objects. It's important to ensure that background images are representative of the scenes the model will encounter during inference, in order to improve generalization and avoid overfitting.

In your case, you could consider a few options:

  1. Use data augmentation techniques, such as random cropping, scaling, and rotation, to generate additional background images from the 75 images that contain objects. This would help to ensure that your background images are representative of the scenes that the model will encounter during inference.

  2. An alternative approach would be to use a pre-trained model, such as the coco.yaml, which includes a "Background" class. This would allow you to use all 100 images for training, without having to explicitly label images as background.

I hope you find this information helpful. If you have any further questions or concerns, please don't hesitate to ask.

YoungjaeDev commented 1 year ago

@glenn-jocher

thank you

  1. In the case of case 1 you mentioned, it seems that Mosaic-based augmentation already has background training for 75 images. Is that correct?
  2. As I understand it, the background image does not participate in training, only inference. But I would like to know exactly how it helps at inference. Does it work by suppressing detections with additional computer vision algorithms on top of deep learning?

glenn-jocher commented 1 year ago

@youngjae-avikus hello,

  1. Yes, Mosaic-based data augmentation can use the object images to facilitate background augmentation, resulting in a greater variety of images without the need for explicit background images.

  2. During inference, the YOLOv5 model detects objects in the input image by running a forward pass through the model's neural network, which is trained using object images during the training phase. In the absence of an object, the model will output a null prediction. The network is not specifically designed to suppress background images, but rather to detect the presence of objects within an image. The model's ability to detect objects in images is based on pattern recognition, which is learned during the training phase through gradient-based optimization of the loss function.

I hope this helps! If you have any further questions or concerns, please don't hesitate to ask.

Hwaaan2 commented 1 year ago

For the background image, do I put any image without a labeling value into the dataset?

glenn-jocher commented 1 year ago

@Hwaaan2 hello,

Yes, you can include images without labeled objects as background images in your YOLOv5 training dataset. This can help reduce false positives during inference by providing the model with examples of scenes without objects to recognize. However, it's important to ensure that background images are representative of the scenes the model will encounter during inference, in order to improve generalization and avoid overfitting.

To include a background class in your dataset, you can label images without objects with a unique background class value, such as "0". During training, the model will learn to distinguish between objects and the background class based on the labeled values.

I hope this helps! If you have any further questions or concerns, please don't hesitate to ask.

Thank you.

Razamalik4497 commented 1 year ago

I added only background images into the directories train/images & valid/images, but when I trained the model it showed me missing instead of background images

Google Colab: train: Scanning '/content/drive/MyDrive/yolov5/train' images and labels...438 found, 339 missing, 0 empty, 0 corrupt:

glenn-jocher commented 1 year ago

@Razamalik4497 hi,

It seems that the YOLOv5 training code is expecting images to be present in the same directories as the labels, and is reporting the presence of missing images (as well as empty and corrupt files) during the scanning process. In your case, since you only included background images in the train/images and valid/images directories, the model is reporting missing images for the object images that were not included in the dataset.

To avoid this issue, you can consider creating separate directories for your object and background images, and then manually specifying these directories using the --img-dir and --bg-dir arguments during training. By doing so, the model should be able to properly recognize and use your background images without reporting missing object images.

I hope this helps! If you have any further questions or concerns, please don't hesitate to ask.

Thanks.

Razamalik4497 commented 1 year ago

Thanks for your quick response. Unfortunately, I couldn't find the --img-dir and --bg-dir arguments in the train.py file, which is why I get an error during training.

Razamalik4497 commented 1 year ago

command : python train.py --img 416 --batch 1 --epochs 1 --data data.yaml --cfg models/yolov5n.yaml --name yolov5s_results --weights yolov5n.pt --img-dir data --bg-dir back_data

glenn-jocher commented 1 year ago

@Razamalik4497 hello,

The --img-dir and --bg-dir arguments are not currently implemented in YOLOv5 training, which is why you are receiving an error when attempting to use them. To incorporate background images into your training pipeline, you can include them directly alongside your object images in the train/images and valid/images directories, and then mark them as containing no objects by giving each one an empty .txt label file.

Additionally, it's important to ensure that your background images are representative of the scenes the model will encounter during inference, in order to improve generalization and avoid overfitting. You may also want to consider using data augmentation techniques, such as random cropping, scaling, and rotation, to generate additional background images.

I hope this clarifies the situation! If you have any further questions or concerns, please don't hesitate to ask.

Best regards.

Razamalik4497 commented 1 year ago

thanks now it's working when I put the empty labels in the labels directory,

glenn-jocher commented 1 year ago

@Razamalik4497 great to hear that it's working now with your empty labels in the labels directory! This is the intended behavior in YOLOv5 when using background images. By giving your background images empty label files, you can effectively train the model to distinguish between objects and background scenes, which can help reduce false positives during inference. If you have any further questions or concerns, please don't hesitate to ask.

Razamalik4497 commented 1 year ago

Hi, I've hit another problem. I run training on my PC and on Google Colab with the same dataset and the same command, but the PC detects the background images and Google Colab does not.

Google Colab:

train: Scanning '/content/drive/MyDrive/yolov5/train/labels' images and labels...33836 found, 315 missing, 1007 empty, 0 corrupt: 100% 34151/34151 [04:42<00:00, 121.04it/s]
train: New cache created: /content/drive/MyDrive/yolov5/train/labels.cache
val: Scanning '/content/drive/MyDrive/yolov5/valid/labels' images and labels...9347 found, 265 missing, 406 empty, 0 corrupt: 100% 9612/9612 [01:13<00:00, 130.55it/s]
val: New cache created: /content/drive/MyDrive/yolov5/valid/labels.cache
Plotting labels to runs/train/yolov5s_results/labels.jpg...

My PC:

train: Scanning D:\P-R-O-J-E-C-T\Racket\13th training\dataset\racket_data\racket_data_new\train\labels... 33836 images,
train: New cache created: D:\P-R-O-J-E-C-T\Racket\13th training\dataset\racket_data\racket_data_new\train\labels.cache
val: Scanning D:\P-R-O-J-E-C-T\Racket\13th training\dataset\racket_data\racket_data_new\valid\labels... 9347 images, 671 backgrounds, 0 corrupt: 10
val: New cache created: D:\P-R-O-J-E-C-T\Racket\13th training\dataset\racket_data\racket_data_new\valid\labels.cache
AutoAnchor: 3.53 anchors/target, 1.000 Best Possible Recall (BPR). Current anchors are a good fit to dataset
Plotting labels to runs\train\yolov5s_results\labels.jpg...
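
The two logs above can be compared numerically. For the validation set, 9347 found + 265 missing = 9612 images on Colab, and 265 missing + 406 empty = 671, which equals the "671 backgrounds" printed on the PC. One possible reading, not confirmed anywhere in this thread, is that both machines see the same files and only the summary format differs. The sketch below reproduces that counting on a local folder; the function name and suffix list are illustrative, and this is not the scanner YOLOv5 itself uses.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to your dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def scan_labels(images_dir, labels_dir):
    """Count label files per image: found (file exists), missing, empty."""
    found = missing = empty = 0
    for image in sorted(Path(images_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        label = Path(labels_dir) / f"{image.stem}.txt"
        if not label.is_file():
            missing += 1
            continue
        found += 1
        if not label.read_text().strip():
            empty += 1  # an empty file is also counted under "found"
    return {
        "images": found + missing,
        "found": found,
        "missing": missing,
        "empty": empty,
        "backgrounds": missing + empty,
    }


# Cross-check against the validation figures quoted in the logs above.
colab_val = {"found": 9347, "missing": 265, "empty": 406}
assert colab_val["found"] + colab_val["missing"] == 9612
assert colab_val["missing"] + colab_val["empty"] == 671
```

Running `scan_labels` on the same folders in both environments would show directly whether the file counts really differ.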

glenn-jocher commented 1 year ago

@Razamalik4497 hi, it's possible that the difference in the number of missing images between your PC and Google Colab is due to missing or incorrect file paths in your dataset on Colab. Double-check that your training and validation dataset directories are specified correctly in your code on Colab, and that your dataset files are available and accessible from your Colab environment.

Another possibility is that there may be differences in how the file system works on your PC and on Colab, which may lead to differences in how the YOLOv5 training pipeline scans and processes your images. You may want to try copying your entire dataset to Colab and running training from there to see if the issue persists.

I hope this helps! Let me know if you have any other concerns or issues.

Guillermo-Ingles commented 1 year ago

I would like to know if it is beneficial or not to include background images (unlabeled images) in the validation or test set, or if it is only good to include them in the training set. Additionally, I would also like to know if it's good practice for background images to make up more than 10% of the total, as the documentation says between 0 and 10% but I don't know why I can't include more.

glenn-jocher commented 1 year ago

@Guillermo-Ingles hello,

Including background images (unlabeled images) in the validation or test set can be beneficial in order to evaluate the model's performance in distinguishing between objects and background scenes. It helps assess the model's ability to avoid false positives. Therefore, it is recommended to include background images in both the training and validation/test sets.

As for the percentage of background images in the dataset, the documentation suggests keeping it between 0% and 10%. This range ensures that the model is trained primarily on object images, allowing it to learn to classify objects accurately. Including too many background images may affect the model's ability to detect objects effectively. However, it is worth noting that the exact percentage may vary depending on your specific use case and the nature of your dataset.

I hope this clarifies your concerns. If you have any further questions or require additional assistance, please feel free to ask.

Baro1502 commented 1 year ago

Hi @glenn-jocher, I have a question with YOLOv8.

I am training YOLOv8 for a single class, with no background images in the training set and about 80% background images in the validation set. If I use model.val(), will the model try to predict on the background images? Below is the confusion matrix generated automatically by the function mentioned above, and beside it is my manually calculated confusion matrix. I wonder why the FP value is so high. The code I used for FP is:

if true_label == 0 and pred_label == 1:
    confusion_matrix[0, 1] += 1  # False positive

Assume the two variables true_label and pred_label are integers, assigned based on the presence of a .txt file in the labels folder.
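
A minimal sketch of the image-level count described above, assuming one integer per image (1 = objects present, 0 = background). The function name is illustrative. How model.val() builds its own matrix is not described in this thread; if it counts per detection rather than per image, that would be one possible reason for the two matrices to differ.

```python
def image_level_confusion(true_labels, pred_labels):
    """Return a 2x2 matrix indexed [true][pred] with one count per image."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("need exactly one prediction per image")
    matrix = [[0, 0], [0, 0]]
    for true_label, pred_label in zip(true_labels, pred_labels):
        matrix[true_label][pred_label] += 1
    return matrix


# Toy validation set: four background images and one image with objects.
true_labels = [0, 0, 0, 0, 1]
pred_labels = [0, 1, 1, 0, 1]

matrix = image_level_confusion(true_labels, pred_labels)
false_positives = matrix[0][1]
true_positives = matrix[1][1]
precision = true_positives / (true_positives + false_positives)

print(matrix)  # [[2, 2], [0, 1]]
print(precision)
```

With a validation set that is mostly background, even a modest false-alarm rate per background image produces many false positives relative to true positives, so a high FP count is not surprising in itself.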

Thank you for your time helping me out!

ocomputer commented 1 year ago

Hi! I have a project where I need to 1) change the background in YOLOv5 output to a single solid color and 2) blur the background, so that only certain objects stand out. I have already trained on my dataset to identify the objects I want, but I don't know how to complete the other tasks. Please help!

glenn-jocher commented 1 year ago

@ocomputer hello!

To change the background of your YOLOv5 detections to a solid color, you can modify the code that handles visualization. In the file utils/general.py, you can find the plot_one_box function which is responsible for drawing bounding boxes. You can change the fill parameter of the cv2.rectangle function to the desired color.

Regarding blurring the background, you can use OpenCV's image processing functions to apply blur to the background. You can create a mask of the objects detected by YOLOv5, then blur the remaining portion of the image using the cv2.blur or cv2.GaussianBlur functions. Finally, you can combine the blurred background with the original objects using the mask.
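
The mask-and-combine idea in the reply above can be shown without any image library. The sketch below is a pure-Python illustration for the solid-color case, assuming the image is a list of pixel rows and the detections are `(x1, y1, x2, y2)` boxes with exclusive right and bottom edges (all of these conventions are assumptions made for the example). For blurring, the fill value would be replaced by the corresponding pixel of a blurred copy, for instance one produced with cv2.GaussianBlur as suggested above.

```python
def fill_background(image, boxes, colour):
    """Return a copy of image with every pixel outside all boxes set to colour.

    image  -- list of rows, each a list of pixel values
    boxes  -- iterable of (x1, y1, x2, y2); x2 and y2 are exclusive
    colour -- the value written to background pixels
    """
    boxes = list(boxes)
    result = []
    for y, row in enumerate(image):
        new_row = []
        for x, pixel in enumerate(row):
            inside = any(x1 <= x < x2 and y1 <= y < y2 for x1, y1, x2, y2 in boxes)
            new_row.append(pixel if inside else colour)
        result.append(new_row)
    return result


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
# One detection covering only the centre pixel.
print(fill_background(image, [(1, 1, 2, 2)], colour=0))
```

On real images the same logic is usually expressed with a boolean NumPy mask, which is much faster than looping over pixels.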

Remember to experiment with different parameters and techniques to achieve the desired effect. Feel free to ask if you have any specific questions or need further assistance.

Mps24-7uk commented 1 year ago

Hi @glenn-jocher ,

I would like to know how background images help with false positives, and which part of the YOLOv5 code uses the background images for training.

tino926 commented 1 year ago

Hi @glenn-jocher ,

I would like to know how background images help with false positives, and which part of the YOLOv5 code uses the background images for training.

You can check the ComputeLoss class in loss.py. For each prediction on a background image, the objectness is suppressed.
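
A simplified sketch of that idea, not the actual ComputeLoss code: with no labelled boxes, every prediction on a background image gets an objectness target of 0, so a binary cross-entropy loss penalises any prediction that is confident an object is present. The function names and the example logits are made up for illustration.

```python
import math


def bce_with_logits(logit, target):
    """Numerically stable binary cross-entropy on a raw (pre-sigmoid) logit."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def background_objectness_loss(objectness_logits):
    """Mean objectness loss for one background image (all targets are 0)."""
    losses = [bce_with_logits(logit, 0.0) for logit in objectness_logits]
    return sum(losses) / len(losses)


# Two confident false alarms versus uniformly low objectness.
noisy = background_objectness_loss([3.0, 2.5, -4.0])
quiet = background_objectness_loss([-4.0, -5.0, -4.0])

print(noisy > quiet)  # True: false alarms on a background image cost more
```

Gradient descent on this term lowers the objectness logits on object-free scenes, which is how background images can reduce false positives.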