
How is confidence calculated by YOLOv8? #4149

Closed: harmindersinghnijjar closed this issue 1 year ago

harmindersinghnijjar commented 1 year ago

Question

Hello,

I may have an incorrect conceptual understanding of confidence as used by YOLO models, so I'd like to better understand it and make sure my understanding is correct.

YOLO models have two types of confidence: box confidence and class confidence. Box confidence is the probability that a bounding box contains an object. This is calculated by:

Confidence Score = Pr(Object) * IoU(pred, truth)

Then there is class confidence, which is the likelihood that a detected object belongs to a particular class. This is calculated by: C_i = Pr(Class_i|Object) * Pr(Object) * IoU(pred, truth)

In YOLOv8, these two quantities are multiplied together, and the resulting product is output as the confidence score.
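
As a concrete example with made-up numbers: if Pr(Object) = 0.9 and IoU(pred, truth) = 0.8, the box confidence is 0.9 * 0.8 = 0.72; and with Pr(Class_i|Object) = 0.95, the class confidence C_i is 0.95 * 0.72 = 0.684.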

Is this an accurate high-level understanding of confidence for YOLOv8? Please let me know if I'm misunderstanding something.

Also, is the output confidence score what appears on the X-axis of most metric curves, such as the Precision-Confidence Curve, Recall-Confidence Curve, and F1-Confidence Curve?

Thank you!

github-actions[bot] commented 1 year ago

👋 Hello @harmindersinghnijjar, thank you for your interest in YOLOv8 🚀! We recommend a visit to the YOLOv8 Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Join the vibrant Ultralytics Discord 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.

Install

Pip install the ultralytics package including all requirements in a Python>=3.7 environment with PyTorch>=1.7.

pip install ultralytics

Environments

YOLOv8 may be run in a number of up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled); see the Ultralytics Docs for the current list.

Status

If the Ultralytics CI badge on the repository is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLOv8 Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

glenn-jocher commented 1 year ago

@harmindersinghnijjar hello,

Your conceptual understanding of confidence in YOLO models, and specifically YOLOv8, is fundamentally correct. Indeed, YOLO models define two types of confidence: box confidence and class confidence.

Box confidence is a measure of how certain the model is that a bounding box contains an object of interest. It combines the objectness score (the model's certainty that the box contains an object at all) with the Intersection over Union (IoU) between the predicted bounding box and the ground truth.

Class confidence, on the other hand, is an expression of how certain the model is that the detected object belongs to a particular class. This calculation involves taking the conditional probability of the class given that an object has been detected (Pr(Class_i|Object)), multiplying that with the objectness score and the IoU.

The confidence score that YOLOv8 outputs is a combination of these two confidences, which enables it to balance between how certain it is that a box contains an object and how certain it is about which class this object belongs to.

As for your question regarding metric curves, you're correct that the confidence score is often used as the X-axis. These curves allow us to see how precision, recall, F1 score, etc., change as we adjust the confidence threshold.
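If you want to see these scores in practice, here is a minimal sketch using the ultralytics Python API (the image path is just a placeholder):

```python
from ultralytics import YOLO

# Load a pretrained YOLOv8 detection model
model = YOLO("yolov8n.pt")

# Run inference; each element of `results` corresponds to one input image
results = model("bus.jpg")

# Each detected box carries one fused confidence score and a class index
for box in results[0].boxes:
    print(float(box.conf), int(box.cls))
```

These per-box values are the same scores that validation sweeps over to draw the Precision-Confidence, Recall-Confidence, and F1-Confidence curves.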

You are on point with these concepts. Good going & hope this helps! Let us know if you have any other questions.

BossCrab-jyj commented 1 year ago

@glenn-jocher hello, can you tell me in which file the code that merges the two confidences is? I want to modify the ratio between them, because I found in YOLOv5 that using (cls score * box score) leads to a lot of missed detections.
I care more about whether an object is detected at all than about whether the classification is correct, so I increased the weight of the box score. But I couldn't find where the score merging happens in YOLOv8.

glenn-jocher commented 1 year ago

@BossCrab-jyj hello,

The merging of the class confidence and box confidence isn't done in any single, explicit location within the YOLOv8 codebase. The confidence scores are part and parcel of the loss function computation that guides the training process, and therefore the final model output.

If you are looking to adjust the balance between the box confidence and class confidence, you might want to look into adjusting the associated weights in the loss function, which indirectly influences the contribution of these confidences to the final result.

In YOLOv8, the loss is computed by the model's training criterion, and the relative weights of its terms are exposed as hyperparameters for model training.

Please bear in mind that altering these weights might affect the overall object detection performance of the model and may require a re-evaluation of your validation set to ensure the model is still meeting your detection or classification performance benchmarks.

I hope this is helpful, and don't hesitate to reach out if you have any more queries.

BossCrab-jyj commented 1 year ago

@glenn-jocher Appreciated. Do you mean that in YOLOv8 I can only adjust the values of the three hyperparameters box, cls, and dfl in the configuration file? I have seen related issues mentioning that the values of these three parameters were obtained by evolution on the COCO dataset. If I am more concerned about whether the target can be detected at all, do I just need to increase the value of box?

glenn-jocher commented 1 year ago

@BossCrab-jyj, yes, you're on the right track. YOLOv8 allows you to tune the hyperparameters named 'box', 'cls', and 'dfl' to better fit the specifics of your problem.

If your primary concern is detecting the objects, irrespective of their classes, it would be appropriate to increase the weight of the 'box' loss in your configuration file. This would make the model pay more attention to the task of correctly locating the objects, which seems to be your objective.

Be mindful though, tuning these hyperparameters may adjust the balance between precision and recall in your model. Higher weight on 'box' might improve recall (i.e., reduce missed detections), but could potentially decrease precision (i.e., increase false positives). As always, when making such changes, ensure to monitor your model's performance to validate its effectiveness.
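For illustration, a minimal sketch of how you might pass a heavier 'box' gain through the Python API; the default values shown are assumptions based on the current configuration, so check the defaults that ship with your ultralytics version:

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# Raise the box loss gain above its default (assumed 7.5 here) while keeping
# cls and dfl at their assumed defaults, to favor localization over classification
model.train(data="coco128.yaml", epochs=100, box=10.0, cls=0.5, dfl=1.5)
```

The same keys can also be set in a training configuration YAML instead of being passed as keyword arguments.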

Remember, while these parameters were evolved on the COCO dataset, your specific dataset or use case might necessitate different optimal parameters. I hope this clarifies your question and aids in your model's performance optimization. Keep us posted on your progress, and don't hesitate to ask if you have more questions.

github-actions[bot] commented 1 year ago

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

Tschowtschow commented 11 months ago

How does this work for predictions outside of validation? Is the returned confidence then the product of the class confidence and the box confidence, without regard to the IoU? Because there can't be an IoU with any ground truth when predicting on an image that has not been labeled, right?

glenn-jocher commented 11 months ago

@Tschowtschow hello,

You are absolutely correct. During prediction on new unlabeled data, the Intersection over Union (IoU) with ground truth cannot be calculated as there is no ground truth available. In this case, the confidence score essentially becomes the product of the box confidence and the class confidence.

The box confidence represents how certain the model is that it has correctly placed a bounding box around an object, while the class confidence signifies the model's estimation of the object belonging to a specific class.

So, to your point, without a labeled ground truth available, these two confidences become the crucial factors determining the overall confidence score of the detection.
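To make this concrete, here is a small sketch (the image path is a placeholder): the score returned at prediction time is the model's own output, and the conf argument simply thresholds it.

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# No labels are involved here: conf=0.25 simply discards detections whose
# model-reported confidence is below 0.25
results = model.predict("image.jpg", conf=0.25)
print(results[0].boxes.conf)  # per-detection confidence scores
```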

Thanks for your excellent question, and do reach out if you have any further questions or aspects you'd like to discuss.

flamemingo commented 10 months ago

Hi @glenn-jocher, thanks for your wonderful work with YOLO. As you mentioned, during inference YOLOv8 calculates confidence from:

  1. Box confidence
  2. Class confidence

Are there any articles/papers that may delve deeper into the details for these calculations, specifically for YOLOv8?

Thanks for the wonderful work once again, cheers!

glenn-jocher commented 10 months ago

Hello @flamemingo,

Thank you for your kind words and interest in the inner workings of YOLOv8!

For an in-depth understanding of the confidence calculations and the mechanics of YOLOv8, you can refer to the research papers and technical reports that it builds upon. While I cannot point to a specific resource for YOLOv8 at this time, reading the original YOLO papers and their successors will provide you with a comprehensive overview of the methodologies employed.

YOLOv8 is an evolution of the ideas presented in these works and the technical details on our current implementation and improvements are documented in our repository and the official Ultralytics documentation at https://docs.ultralytics.com. I encourage you to explore those resources for detailed explanations and insights.

Thanks again for your support, and happy reading!

gabrie1-s commented 5 months ago

Hello @glenn-jocher, thanks for the clarifying explanations. Actually, I still didn't get one thing: the probabilities for the classes are given by an array of length C, where C is the number of classes. So, the product Pr(Class_i|Object) * Pr(Object) should result in an array of length C, right? So, why is just one scalar returned as confidence? Is it the product result for the most probable class?

glenn-jocher commented 5 months ago

Hello @gabrie1-s,

Great question! Yes, you're correct. The array of class probabilities (length C) is combined with the objectness score (Pr(Object)) to produce class-specific confidence scores. The final confidence score returned is indeed the highest value from this array, corresponding to the most probable class. This ensures that the detection confidence reflects the most likely class for the detected object. 😊
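Conceptually, with made-up numbers (this is an illustration, not the actual implementation):

```python
import numpy as np

class_probs = np.array([0.10, 0.75, 0.15])  # Pr(Class_i|Object), length C = 3
objectness = 0.9                            # Pr(Object)

scores = class_probs * objectness  # class-specific confidence scores
conf = scores.max()                # reported confidence: 0.675
cls = scores.argmax()              # predicted class index: 1
print(conf, cls)
```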

If you have any more questions, feel free to ask!