--image-weights and background images

tino926 commented 6 months ago

Search before asking

[X] I have searched the YOLOv5 issues and discussions and found no similar questions.

Question

If I use --image-weights, does the training process ignore background images?

In train.py, I noticed the following code snippet:

        if opt.image_weights:
            cw = model.class_weights.cpu().numpy() * (1 - maps) ** 2 / nc  # class weights
            iw = labels_to_image_weights(dataset.labels, nc=nc, class_weights=cw)  # image weights
            dataset.indices = random.choices(range(dataset.n), weights=iw, k=dataset.n)  # rand weighted idx

It seems that for a background image, the corresponding weight will always be 0. Consequently, it won't be selected during the training process. Is this understanding correct?

Additional

No response

glenn-jocher commented 6 months ago

@tino926 hello! Thanks for your question regarding the --image-weights flag during training with YOLOv5.

The --image-weights option is designed to sample images for training with a probability proportional to the number of targets in each image, and inversely proportional to the class frequency. This means that images with rare classes or many objects might be sampled more frequently to help the model learn these classes better.

For images that only contain background (no objects), their weights would indeed be very low or zero, as they do not contribute to the class-specific learning. However, they are not explicitly ignored; they simply have a lower probability of being sampled compared to images with objects. This is intended to help the model focus on learning from images that provide more information about the classes of interest.

If you want to ensure that background images are included in the training process, you might consider not using the --image-weights flag, which will result in a uniform sampling of images, including those with only background.

Remember, the balance between learning from object-containing images and background images is crucial for a well-performing model, so consider your dataset's specifics when deciding on using image weights.

For more detailed information on training options, please refer to our documentation. Keep up the great work with YOLOv5, and happy training! 😊🚀

tino926 commented 6 months ago

The function, labels_to_image_weights, is defined as follows:

def labels_to_image_weights(labels, nc=80, class_weights=np.ones(80)):
    # Produces image weights based on class_weights and image contents
    # Usage: index = random.choices(range(n), weights=image_weights, k=1)  # weighted image sample
    class_counts = np.array([np.bincount(x[:, 0].astype(int), minlength=nc) for x in labels])
    return (class_weights.reshape(1, nc) * class_counts).sum(1)

Given this definition, I am almost certain that the weight corresponding to a background image is zero: an image with no labels has all-zero class counts, so the weighted sum is zero regardless of the class weights.

Therefore the background images' indices won't occur in the dataset.indices created by random.choices(). This seems to contradict "they simply have a lower probability of being sampled compared to images with objects".
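Here is a minimal sketch that checks this on a toy dataset. The labels, class weights, and draw count are made up for illustration, and the function is copied from above with only the default argument changed to None:

```python
import random

import numpy as np


def labels_to_image_weights(labels, nc=80, class_weights=None):
    # Same arithmetic as the function quoted above; only the default argument differs
    if class_weights is None:
        class_weights = np.ones(nc)
    class_counts = np.array([np.bincount(x[:, 0].astype(int), minlength=nc) for x in labels])
    return (class_weights.reshape(1, nc) * class_counts).sum(1)


# Toy data (assumed): 3 classes, 4 images, each row is [class, x, y, w, h]
nc = 3
labels = [
    np.array([[0, 0.5, 0.5, 0.2, 0.2], [1, 0.3, 0.3, 0.1, 0.1]]),  # two objects
    np.zeros((0, 5)),                                              # background image
    np.array([[2, 0.4, 0.6, 0.3, 0.3]]),                           # one object
    np.zeros((0, 5)),                                              # background image
]
cw = np.array([1.0, 2.0, 4.0])  # arbitrary class weights for the demo

iw = labels_to_image_weights(labels, nc=nc, class_weights=cw)
print("image weights:", iw)  # background images get weight 0

random.seed(0)
indices = random.choices(range(len(labels)), weights=iw, k=10_000)
sampled = set(indices)
print("indices that were ever sampled:", sorted(sampled))
```

In this toy run, the background images (indices 1 and 3) never appear among the sampled indices, no matter how many draws are made.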

Perhaps this is a bug, or perhaps this is a limitation that should be documented?

glenn-jocher commented 6 months ago

@tino926, thanks for the follow-up and for diving deeper into the code. Your analysis is indeed accurate based on the labels_to_image_weights function you've outlined. If an image has no labels (i.e., it's a background image), its weight for selection during training with --image-weights enabled would effectively be zero. This means such images would not be sampled for training under this setting, as their indices would not be included in dataset.indices after the random.choices() call.

This behavior is a consequence of the current implementation aimed at emphasizing learning from images with labeled objects, especially those from underrepresented classes. It's not a bug per se but a specific design choice for the --image-weights feature. However, I appreciate your point that this could be more clearly documented, as understanding the impact of --image-weights on training data selection is crucial for users.

We always aim to improve YOLOv5 and its documentation based on user feedback. I'll take your suggestion back to the team to consider how we can better document this behavior or explore adjustments to how background images are handled with --image-weights in future updates.
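As an illustration only, one possible adjustment would be to give zero-weight images a small share of the total sampling mass before calling random.choices(). The helper add_background_floor and its background_fraction parameter below are hypothetical and are not part of YOLOv5; this is a sketch of the idea, not a planned implementation:

```python
import random

import numpy as np


def add_background_floor(image_weights, background_fraction=0.1):
    """Hypothetical helper (not in YOLOv5): let zero-weight images share a fixed fraction of sampling mass."""
    iw = np.asarray(image_weights, dtype=float).copy()
    bg = iw == 0                      # images with no labels end up with weight 0
    n_bg = int(bg.sum())
    labeled_mass = iw.sum()
    if n_bg == 0 or labeled_mass == 0:
        return iw                     # nothing to adjust
    # Total background weight such that backgrounds make up background_fraction of the new sum
    total_bg = labeled_mass * background_fraction / (1 - background_fraction)
    iw[bg] = total_bg / n_bg          # split evenly across background images
    return iw


# Toy image weights (assumed): images 1 and 3 are background
iw = np.array([3.0, 0.0, 4.0, 0.0])
adjusted = add_background_floor(iw, background_fraction=0.1)
bg_share = adjusted[[1, 3]].sum() / adjusted.sum()

random.seed(0)
indices = random.choices(range(len(adjusted)), weights=adjusted, k=10_000)
print("adjusted weights:", adjusted)
print("background share of sampling mass:", bg_share)
print("indices that were ever sampled:", sorted(set(indices)))
```

With this kind of floor, background images would be drawn at a controlled rate instead of never, while labeled images keep their relative weights.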

Thank you for bringing this to our attention, and please continue to share any further insights or questions you might have! Your contributions help make YOLOv5 better for everyone. 🙌🚀

