Yolov5 confusion matrix with background FP=1 and TN=0

yiluny217 commented 1 year ago

Hello,

I was training a model to detect trucks in pictures, and here is the resulting confusion matrix for my val data. Following the usual convention for reading a confusion matrix, I'll call the upper-left cell TP, the upper-right cell FP, the lower-left cell FN and the lower-right cell TN. (Please ignore the classes 'bicycle' and 'person': the original dataset only has trucks labeled, but 'truck' was assigned 'class=2' during manual annotation.)

[Image: confusion_matrix]

For the 'background' column, the values are 1 and 0. I searched online and found that a lot of people are having the same issue. Here are some examples: yolov5 issue 10365, yolov5 issue 1665, stackoverflow. In yolov5 issue 1665, I noticed @glenn-jocher gave a brief explanation that 'columns are normalized', but I'm still quite confused. May I get a clearer explanation of why this happens, and is there a possible way to fix it?

Another thing bothering me is that actually I didn't have any annotation of background in my training data, so I guess that's why TN=0?

github-actions[bot] commented 1 year ago

👋 Hello @yiluny217, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Requirements

Python>=3.7.0 with all requirements.txt installed including PyTorch>=1.7. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Notebooks with free GPU:
Google Cloud Deep Learning VM. See GCP Quickstart Guide
Amazon Deep Learning AMI. See AWS Quickstart Guide
Docker Image. See Docker Quickstart Guide

Status

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on MacOS, Windows, and Ubuntu every 24 hours and on every commit.

Introducing YOLOv8 🚀

We're excited to announce the launch of our latest state-of-the-art (SOTA) object detection model for 2023 - YOLOv8 🚀!

Designed to be fast, accurate, and easy to use, YOLOv8 is an ideal choice for a wide range of object detection, image segmentation and image classification tasks. With YOLOv8, you'll be able to quickly and accurately detect objects in real-time, streamline your workflows, and achieve new levels of accuracy in your projects.

Check out our YOLOv8 Docs for details and get started with:

pip install ultralytics

github-actions[bot] commented 1 year ago

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.

Access additional YOLOv5 🚀 resources:

Wiki – https://docs.ultralytics.com/yolov5
Tutorials – https://docs.ultralytics.com/yolov5
Docs – https://docs.ultralytics.com

Access additional Ultralytics ⚡ resources:

Ultralytics HUB – https://ultralytics.com/hub
Vision API – https://ultralytics.com/yolov5
About Us – https://ultralytics.com/about
Join Our Team – https://ultralytics.com/work
Contact Us – https://ultralytics.com/contact

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!

github-actions[bot] commented 1 year ago

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

For additional resources and information, please see the links below:

Docs: https://docs.ultralytics.com
HUB: https://hub.ultralytics.com
Community: https://community.ultralytics.com

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

jbezovsek commented 1 year ago

I have the same issue interpreting this kind of result. My interpretation is that the columns do not depend on each other, as you might assume from a simple 2x2 confusion matrix. The 2x2 confusion matrix I have in mind should ideally look like [1, 0; 0, 1]. Because the columns are normalized, each column has to sum to 1, but a row can sum to more than 1. In your case I would say that when the actual object was a truck, the model predicted truck 74% of the time and background 26% of the time. I am still a little confused about the background, because @glenn-jocher said here: https://github.com/ultralytics/yolov5/issues/1665#issuecomment-1219468154 that the background is not predicted, so could this be the reason for the results in the background column?

glenn-jocher commented 1 year ago

@jbezovsek thank you for sharing your concerns about interpreting the confusion matrix for YOLOv5. It can be confusing to understand the results of the matrix, especially when dealing with single-class detection and background.

You are correct that the columns of this matrix do not depend on each other the way they do in a simple 2x2 confusion matrix. In YOLOv5's plot, the columns represent the true classes and the rows represent the predicted classes, and each column is normalized independently. As a result, each column sums to 1, while a row may sum to more or less than 1.

Regarding the background class, it is not predicted by the model. Therefore, the TN value of 0 is likely a result of there being no true negative samples counted for the background class during validation, rather than a sign of a faulty model.
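To make the column normalization concrete, here is a minimal NumPy sketch. The counts are made up for illustration, and the layout (rows = predicted, columns = true, last index = background) is assumed to match the plot discussed in this thread; it is not taken from the YOLOv5 source.

```python
import numpy as np

# Hypothetical raw counts for a single-class model (rows = predicted, columns = true).
# Index 0 is "truck", index 1 is "background"; the numbers are invented for illustration.
counts = np.array([
    [74.0, 12.0],  # predicted truck: 74 matched trucks, 12 detections on background
    [26.0, 0.0],   # predicted background: 26 missed trucks, nothing counted in [bg, bg]
])

# Normalize each column so that it sums to 1 (the epsilon avoids division by zero).
normalized = counts / (counts.sum(axis=0, keepdims=True) + 1e-9)

column_sums = normalized.sum(axis=0)
row_sums = normalized.sum(axis=1)

print(np.round(normalized, 2))
print("column sums:", np.round(column_sums, 2))
print("row sums:", np.round(row_sums, 2))
```

With a single object class, every unmatched detection lands in the same row of the background column, so that cell normalizes to 1 however many such detections there are, and the cell below it stays at 0.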

Once again, thank you for your question, and please let us know if you have any further concerns about YOLOv5 or vision AI in general.

ilhamalvindo commented 1 year ago

Hello @glenn-jocher, does that mean that for a single class, FP=1 and TN=0 on the background class appear because we don't have any background samples in the validation set, and not because our model is wrong?

I ask because I have the same issue as above: my FP=1 and TN=0 for a single-class label.

glenn-jocher commented 1 year ago

Hello @ilhamalvindo, thank you for reaching out with your question regarding YOLOv5's confusion matrix and understanding the results.

You are correct. In single-class detection with background, a TN value of 0 likely means that no true negative samples were counted for the background class in the validation set. Therefore, it does not necessarily imply that your model is wrong. Similarly, because each column is normalized to sum to 1, an FP value of 1 does not mean there was exactly one false positive; it means that all of the detections counted in the background column were assigned to your single object class.

Please let me know if you have any further questions or concerns.

ilhamalvindo commented 1 year ago

Thank you for the explanation! @glenn-jocher

glenn-jocher commented 1 year ago

@ilhamalvindo you're welcome, happy to be of help! If you have any further questions or issues with YOLOv5, please don't hesitate to ask.

RyanTNN commented 11 months ago

[Image: confusion_matrix] Hello @glenn-jocher. I have some questions. I don't quite understand the background FP and the background FN. This confusion matrix shows a background FP of 0.77 and a background FN of 0.22.

What exactly is the meaning of background FP and FN?
Do they affect the prediction? Why or why not?
As far as I can tell, changing conf to 0.25 or 0.9 only changes the object accuracy but does not change the background FP or FN. Why?
How can I reduce background FP and FN? Thank you!

glenn-jocher commented 11 months ago

@RyanTNN hi there! It seems like your link to the confusion matrix image is not accessible. However, I will still address your questions based on the information provided.

The "background FP" entry counts false positives against the background: detections that do not match any ground-truth box, i.e., cases where the model predicts an object in a region that is actually background. On the other hand, "background FN" counts false negatives: ground-truth objects that the model failed to detect and therefore treated as background.
The background FP and FN can affect the overall performance of the model, especially if the background class is being misclassified frequently, which might lead to incorrect predictions for other classes as well. However, in some scenarios, particularly in single-class detection tasks, the impact might be minimal depending on the specific use case.
Changing the confidence threshold (conf) primarily affects the object accuracy as it determines the minimum confidence score required for an object to be considered as detected. It might not directly impact the background FP or FN if the background class is not being considered in the confidence threshold settings.
To reduce background FP and FN, you may try various techniques such as refining the training data to include more diverse backgrounds, enhancing the model architecture, adjusting training hyperparameters, and possibly introducing data augmentation to expose the model to a wider variety of background scenarios.
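As an illustration of how background FP and FN counts can arise, here is a simplified, single-class sketch that matches detections to ground-truth boxes by IoU. The function names, the greedy matching strategy, and the 0.45 IoU threshold are assumptions made for this example; this is not the YOLOv5 implementation.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def count_background_errors(gt_boxes, det_boxes, iou_thres=0.45):
    """Return (TP, background FP, background FN) for one image, ignoring classes."""
    matched_gt, matched_det = set(), set()
    # Greedy one-to-one matching, visiting candidate pairs from highest IoU down.
    pairs = sorted(
        ((box_iou(g, d), i, j)
         for i, g in enumerate(gt_boxes)
         for j, d in enumerate(det_boxes)),
        reverse=True,
    )
    for iou, i, j in pairs:
        if iou > iou_thres and i not in matched_gt and j not in matched_det:
            matched_gt.add(i)
            matched_det.add(j)
    tp = len(matched_gt)
    background_fp = len(det_boxes) - len(matched_det)  # detections with no ground truth
    background_fn = len(gt_boxes) - len(matched_gt)    # ground truths with no detection
    return tp, background_fp, background_fn


# Hypothetical boxes: one object found, one object missed, one spurious detection.
gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
det = [(1, 1, 10, 10), (50, 50, 60, 60)]
result = count_background_errors(gt, det)
print(result)
```

Note that nothing in this procedure ever counts a "background predicted as background" event, since neither a ground-truth box nor a detection exists for empty regions.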

Feel free to provide additional details or share the confusion matrix image for a more precise analysis or assistance.

kevinkwabena commented 7 months ago

Why does the confusion matrix give me 100% for the background in YOLOv7 and YOLOv8? [Image: confusion_matrix_normalized] @glenn-jocher

glenn-jocher commented 7 months ago

@kevinkwabena hello! A quick note on where these models live: this is the Ultralytics YOLOv5 repository. YOLOv7 is not an Ultralytics project; it is developed by a different team and may have a different implementation and behavior. YOLOv8 is an Ultralytics model, but it is maintained in the separate ultralytics package rather than in this repository.

Regarding your question about a confusion matrix showing 100% in the background, this typically indicates that the model is predicting the background class for all the samples, which could be due to several reasons such as model overfitting, incorrect labeling, or issues with the validation dataset.

For specific help with YOLOv7 or YOLOv8, I would recommend reaching out in the corresponding repositories or checking their documentation and issues for similar cases and solutions.

If you have questions about YOLOv5, I'd be more than happy to assist you!

Leonardbd commented 7 months ago

[Image: confusion_matrix (2)]

@glenn-jocher Hi! It seems the issue still persists for YOLOv5: I'm not observing any True Negatives (TN) or False Positives (FP) in the confusion matrix. This occurs even though my validation set includes both negative and positive images, with labels provided only for the positive ones. My YAML file specifies only one class, "smoke".

The problem remains the same whether I'm running train.py or val.py.

glenn-jocher commented 7 months ago

@Leonardbd hello! Thanks for reaching out with your confusion matrix concern. 😊

Based on what you've described, it appears your model isn’t recognizing any True Negatives (TN) or False Positives (FP) because your validation set only includes labels for positive cases ("smoke"). For TN and FP to appear, there need to be instances where your model predicts "no smoke" correctly (TN) or incorrectly (FP), which requires labeled negative (no smoke) images in your dataset.

If you haven't already, ensure your dataset includes explicitly labeled negative images (images without smoke) and that they're correctly referenced in your dataset YAML file. This adjustment should help the model recognize and learn from both the presence and absence of smoke, potentially resolving the issue you're observing with the confusion matrix.

Quick tip: Make sure your 'train' and 'val' keys in the YAML file accurately reflect your dataset's structure, including paths to both positive and negative samples.

Let me know if this helps, or if you have further questions!

Leonardbd commented 7 months ago

@glenn-jocher thank you for your guidance!

Based on your advice, it seems I need to include a separate class in the YAML file for images without smoke, labeled as "nosmoke". I've updated my YAML file accordingly:

names:
  0: smoke
  1: nosmoke

However, I'm unsure about how to proceed with annotating the negative images (those without smoke). Since these images inherently lack the object of interest and thus don't have bounding boxes, how should their corresponding .txt annotation files be formatted? Should they simply include a "1" to denote the "nosmoke" class, and if so, how do we address the absence of bounding box coordinates?

I appreciate your assistance in clarifying this matter.

glenn-jocher commented 7 months ago

@Leonardbd glad to hear you're making progress! 😊

For annotating negative images (no object of interest like "nosmoke"), you should create an empty .txt annotation file for each image. There's no need to include a "1" or any other class identifier in these files. Simply having the empty .txt file corresponding to the image tells YOLOv5 that there are no objects present in that image.

Here's a quick example:

For a negative image image123.jpg, you'll have a corresponding image123.txt file that is empty.

This effectively informs the model that "image123.jpg" contains no objects to be detected, helping it learn the "nosmoke" instances without needing explicit bounding boxes.
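If there are many negative images, the empty files can be created in bulk. The following is a minimal sketch using only the standard library; the function name `add_empty_labels` and the directory arguments are hypothetical, and it assumes label files share the image's base name with a .txt extension, as in the image123 example above.

```python
from pathlib import Path


def add_empty_labels(image_dir, label_dir, suffixes=(".jpg", ".jpeg", ".png")):
    """Create an empty .txt label for every image that has no label file yet."""
    image_dir, label_dir = Path(image_dir), Path(label_dir)
    label_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for image_path in sorted(image_dir.iterdir()):
        if image_path.suffix.lower() not in suffixes:
            continue  # skip anything that is not an image
        label_path = label_dir / (image_path.stem + ".txt")
        if not label_path.exists():
            label_path.touch()  # empty file: no objects in this image
            created.append(label_path.name)
    return created
```

Existing label files are left untouched, so the function can safely be run over a folder that mixes positive and negative images, for example `add_empty_labels("path/to/images/val", "path/to/labels/val")` with your own paths.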

Keep up the good work, and feel free to reach out if you have more questions!

Rashimingo commented 5 months ago

[Image: Confusion-matrix] I seem to have the same problem as the people before me, but I don't think I understand what is happening. I included some TN images (called "null" in Roboflow) in my single-class dataset for classifying and predicting PCBs, yet the confusion matrix reads every picture as a PCB and the background is 0, which is kind of suspicious. @glenn-jocher, can you provide me with an explanation, please? Thanks in advance.

glenn-jocher commented 5 months ago

Hi there! 👋

It sounds like you're experiencing an issue where the model predicts every image as containing a PCB. If your dataset includes negative samples ("null" in your case) but the confusion matrix still shows zero background detection, it might be due to how these "null" images are annotated or handled during training.

Ensure that:

Your "null" images have corresponding empty annotation files in your dataset. An empty .txt file for these images indicates to the model that there are no objects present.
Your training dataset configuration in your YAML does correctly reference these negative samples alongside the positive ones.

If both these are in place and you're still seeing issues, it might be worthwhile to double-check the balance and variety in your dataset or consider further tuning your model's hyperparameters.

Let me know if this helps or if there's anything else you'd like to explore! 😊

Rashimingo commented 5 months ago

Hello Glenn, I hope you're doing well. Thanks for the fast reply to my previous email. I now have a new question for you: I got this confusion matrix for predicting and classifying PCBs from other objects, and I was wondering if I could delete the background row and column from it.

Sincerely, Rashed

glenn-jocher commented 5 months ago

Hi Rashed,

Good to hear from you! Regarding your question about removing the background row and column from the confusion matrix, it's technically feasible to exclude these in your visualization if they aren't relevant to your analysis. However, doing so might limit the insight you can gain regarding how well your model is distinguishing between PCBs and true negatives (background). Generally, it's useful to see all aspects of your model's performance, including background predictions, to assess any potential biases or issues.

If you still wish to modify the matrix, this typically involves adjusting the code that generates or visualizes the confusion matrix in your analysis scripts. You would remove the data associated with the background before plotting.
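As a rough sketch of that idea, the background row and column can be sliced off the raw count matrix before normalizing and plotting. The function name `drop_background`, the class names, and the counts below are hypothetical; the sketch assumes background is stored at the last index.

```python
import numpy as np


def drop_background(counts, names):
    """Remove the last (background) row and column, then renormalize columns."""
    trimmed = counts[:-1, :-1]
    col_sums = trimmed.sum(axis=0, keepdims=True)
    normalized = trimmed / np.maximum(col_sums, 1e-9)  # guard against empty columns
    return normalized, names[:-1]


# Hypothetical raw counts (rows = predicted, columns = true), background last.
names = ["PCB", "other", "background"]
counts = np.array([
    [40.0, 3.0, 5.0],
    [2.0, 30.0, 4.0],
    [8.0, 7.0, 0.0],
])

matrix_no_bg, names_no_bg = drop_background(counts, names)
print(names_no_bg)
print(np.round(matrix_no_bg, 2))
```

Slicing the raw counts and renormalizing afterwards keeps each remaining column summing to 1. For a single-class model this leaves a 1x1 matrix, which carries almost no information, so the trimmed view is mainly useful with two or more object classes.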

Let me know if you need further assistance or specific code help!

Cheers 😊

vishakraj64 commented 4 months ago

Hi @glenn-jocher,

I want to clarify this: for every anchor we get an objectness score, which tells whether an object is present in that region or not. If that objectness score is 0, the region is considered background. Is that right?

In this answer, you mentioned that there are no background class samples annotated in the validation dataset (TN), and that's why the FP shows 100% for the object class. That means, for example:

in single-class detection, if 14 (or any number of) anchor regions are detected as an object where there is actually background, then it will show 100% as the FP for the object class
in multi-class detection, if 15 anchor regions are detected as object-A and 30 anchor regions are detected as object-B where those regions are actually background, then the FP for object-A is 0.33 and for object-B is 0.66?

Are the above statements correct? Could you clarify this? Thanks.

glenn-jocher commented 4 months ago

Hi @vishakraj64,

Thank you for your detailed question! Let's clarify how the object score and background detection work in YOLOv5.

Object Score and Background Detection

In YOLOv5, each anchor box predicts an objectness score, which indicates the likelihood of an object being present in that region. If the objectness score is low (close to 0), it suggests that the region is likely background. However, YOLOv5 does not explicitly classify regions as "background"; instead, it focuses on detecting objects of interest.
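A small numeric sketch may help here. It assumes that the final confidence of a candidate box is the objectness score multiplied by the class probability, and that boxes below a confidence threshold are discarded; the scores and the 0.25 threshold are invented for illustration.

```python
import numpy as np

# Hypothetical raw outputs for three candidate boxes.
# Columns: objectness, p(class 0), p(class 1).
raw = np.array([
    [0.90, 0.80, 0.10],
    [0.05, 0.60, 0.30],  # low objectness: most likely background
    [0.70, 0.20, 0.75],
])

objectness = raw[:, :1]
class_conf = raw[:, 1:] * objectness  # combined confidence per class

best_class = class_conf.argmax(axis=1)
best_conf = class_conf.max(axis=1)

conf_thres = 0.25  # assumed threshold for this example
keep = best_conf > conf_thres

print("best class:", best_class)
print("confidence:", np.round(best_conf, 3))
print("kept:", keep)
```

The second box is never labeled "background"; it simply falls below the threshold and produces no detection, which is why background appears in the confusion matrix only through unmatched detections and missed ground truths.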

Single-Class Detection

For single-class detection, if your model incorrectly predicts objects in regions that are actually background, these will be counted as False Positives (FP). For example:

If 14 anchor regions are incorrectly predicted as containing an object when they are actually background, these will contribute to the FP count for that object class.

Multi-Class Detection

For multi-class detection, the FP calculation is similar but distributed across multiple classes. For example:

If 15 anchor regions are incorrectly predicted as object-A and 30 anchor regions are incorrectly predicted as object-B, the FP rate for each class will be calculated based on the total number of incorrect predictions relative to the total predictions made.

Clarification on Your Statements

Your understanding is mostly correct, but let's refine it:

In single-class detection, if 14 anchor regions are incorrectly predicted as containing an object when they are actually background, these will all contribute to the FP count for that single object class.
In multi-class detection, if 15 anchor regions are incorrectly predicted as object-A and 30 anchor regions are incorrectly predicted as object-B, the FP rates will be calculated based on the total number of incorrect predictions for each class.

Example Calculation

For multi-class detection:

If there are 45 incorrect predictions on background (15 for object-A and 30 for object-B) out of a total of 45 such predictions, the FP rate for object-A would be 15/45 ≈ 0.33 and for object-B would be 30/45 ≈ 0.67.
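The same arithmetic can be checked in a few lines. The counts come from the example above; the variable names are chosen for this sketch only.

```python
# Detections that fall on background, grouped by the class the model predicted.
background_column = {"object-A": 15, "object-B": 30}

total = sum(background_column.values())
shares = {name: count / total for name, count in background_column.items()}

for name, share in shares.items():
    print(f"{name}: {share:.2f}")
```

The shares always add up to 1 because they are fractions of the same column total, which is what column normalization does for every column of the matrix.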

Next Steps

If you encounter any issues or bugs, please ensure you are using the latest versions of torch and YOLOv5 from our GitHub repository. If the issue persists, providing a minimum reproducible code example will help us investigate further. You can find guidance on creating one here.

Feel free to reach out with any more questions or clarifications!

Lhhiep-maxcode commented 3 months ago

[Images: confusion_matrix, confusion_matrix_normalized] Hi @glenn-jocher, I have these 2 confusion matrices (normalized and non-normalized). I'm confused about the following:

In my data.yaml file, I have only 1 class ("Injury"). However, these confusion matrices automatically add a "Background" class. Does the "Background" class mean "No object detected"? (The background class is automatically added for multiclass detection as well.)
I have trained more models on other tasks (like car/motorbike detection) and they always have 0.0 for TN (background-background) and 1.0 for FP. Why is that? (I have tested my models on real-life detection and they do well in both the object and the no-object cases.)

glenn-jocher commented 3 months ago

Hi @Lhhiep-maxcode,

Thank you for sharing your confusion matrices and for your detailed questions! Let's address each of your concerns:

1. Background Class in Confusion Matrix

In YOLOv5, the "Background" class in the confusion matrix represents regions where no object is detected. This class is automatically included to help evaluate the model's performance in distinguishing between the presence and absence of objects. Even if your data.yaml file specifies only one class (e.g., "Injury"), the background class is implicitly considered during evaluation to provide a complete picture of the model's performance.

2. Zero True Negatives (TN) and High False Positives (FP)

The issue of having 0.0 for TN and 1.0 for FP in your confusion matrices can be perplexing. Here are a few potential reasons and steps to investigate:

Dataset Composition

Ensure that your validation dataset includes a balanced mix of images with and without the target objects. If your validation set predominantly contains images with objects, the evaluation will include few negative samples (background), so it says little about how the model behaves on background-only images.

Annotation Files

Verify that your negative samples (images without objects) have corresponding empty .txt annotation files. These empty files indicate to the model that no objects are present in those images.

Model Evaluation

If your model performs well in real-life detection scenarios but shows high FP in the confusion matrix, it might be due to the evaluation threshold settings. You can experiment with different confidence thresholds to see if it affects the confusion matrix results.

Example Code for Adjusting Confidence Threshold

You can adjust the confidence threshold during inference to see if it impacts the confusion matrix results:

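import torch

# Load a custom-trained model through PyTorch Hub
model = torch.hub.load('ultralytics/yolov5', 'custom', path='path/to/your/model.pt')

# Set the confidence threshold applied to detections
model.conf = 0.5  # Adjust this value as needed

# Run inference on a validation image and print a summary of the detections
results = model('path/to/your/validation/image.jpg')
results.print()

# The Hub results object does not plot a confusion matrix. The confusion matrix
# is written by val.py, which accepts a confidence threshold on the command line:
#   python val.py --weights path/to/your/model.pt --data path/to/your/data.yaml --conf-thres 0.5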

Next Steps

Verify Dataset and Annotations: Ensure your validation dataset includes both positive and negative samples, and that the annotations are correctly formatted.
Adjust Confidence Threshold: Experiment with different confidence thresholds to see if it impacts the confusion matrix results.
Update Packages: Ensure you are using the latest versions of torch and YOLOv5 from our GitHub repository.

If the issue persists, providing a minimum reproducible example will help us investigate further. You can find guidance on creating one here.

Feel free to reach out with any more questions or clarifications! 😊

Lhhiep-maxcode commented 3 months ago

@glenn-jocher thank you for the reply. I have tried some of the steps you recommended above: adjusting the confidence threshold, checking the balance of images with and without objects, and plotting some TN images. The model did well on the TN images. However, the [row: background, col: background] cell is still 0. I also looked at confusion matrices from the discussions above and they are the same. I think the [background, background] cell is automatically 0. Am I right?

glenn-jocher commented 3 months ago

Hi @Lhhiep-maxcode,

Thank you for following up and for trying out the recommended steps! 😊

Your observation is correct. In YOLOv5's confusion matrix, the [background, background] cell often appears as 0. This is because the confusion matrix is designed to focus on the detection and classification of objects rather than explicitly tracking true negatives (background correctly identified as background).

Why [Background, Background] is 0

The primary goal of YOLOv5 is to detect and classify objects within images. The confusion matrix is structured to highlight the performance of the model in terms of True Positives (TP), False Positives (FP), and False Negatives (FN) for the object classes. The [background, background] cell is not typically populated because the model's primary task is not to classify background regions but to detect objects.

Practical Implications

While the [background, background] cell is 0, this does not imply that the model is failing to recognize background regions correctly. Instead, it means that the confusion matrix is not explicitly tracking these true negatives. Your model's good performance on TN images, as you mentioned, indicates that it is effectively distinguishing between objects and background in practice.

Further Steps

If you want to delve deeper into the model's performance on background regions, you might consider additional metrics or custom evaluation scripts that explicitly track true negatives. However, for most object detection tasks, focusing on TP, FP, and FN provides sufficient insight into model performance.
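One possible custom metric is an image-level tally, sketched below. It treats an image as "positive" if it has at least one label and as "predicted positive" if the model returns at least one detection. This definition, the function name, and the example numbers are assumptions for illustration; it is not how the YOLOv5 confusion matrix is computed.

```python
def image_level_counts(num_labels_per_image, num_detections_per_image):
    """Tally image-level TP, FP, FN and TN from per-image label/detection counts."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for n_labels, n_dets in zip(num_labels_per_image, num_detections_per_image):
        has_object = n_labels > 0
        predicted_object = n_dets > 0
        if has_object and predicted_object:
            counts["TP"] += 1
        elif predicted_object:
            counts["FP"] += 1  # detection on a background-only image
        elif has_object:
            counts["FN"] += 1  # image with objects but no detection
        else:
            counts["TN"] += 1  # background-only image, correctly left empty
    return counts


# Hypothetical validation set of four images.
labels_per_image = [2, 0, 1, 0]
detections_per_image = [1, 0, 0, 3]
summary = image_level_counts(labels_per_image, detections_per_image)
print(summary)
```

Unlike the box-level matrix, this tally has a well-defined TN cell, because a whole image with no labels and no detections is a countable event.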

Feel free to reach out if you have any more questions or need further assistance. Keep up the great work! 🚀

Lhhiep-maxcode commented 2 months ago

@glenn-jocher awesome, thank you for your explicit explanation

glenn-jocher commented 2 months ago

Hi @Lhhiep-maxcode,

You're very welcome! I'm glad the explanation was helpful to you. 😊

If you encounter any further issues or have more questions as you continue working with YOLOv5, please don't hesitate to reach out. We are always here to help and ensure you get the best performance out of your models.

In the meantime, if you haven't already, you might find our YOLOv5 documentation useful for additional insights and best practices. It covers a wide range of topics from training to deployment.

Happy coding and best of luck with your projects! 🚀
