ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

When confidence is 0, Recall is not 1 #11286

Closed China-Young7 closed 1 year ago

China-Young7 commented 1 year ago

Search before asking

Question

Why is the value of Recall not 1 when Confidence is 0 in the R_curve? Theoretically, Recall should be 1.

Additional

No response

github-actions[bot] commented 1 year ago

👋 Hello @China-Young7, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Requirements

Python>=3.7.0 with all requirements.txt installed including PyTorch>=1.7. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

YOLOv5 CI

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on MacOS, Windows, and Ubuntu every 24 hours and on every commit.

Introducing YOLOv8 🚀

We're excited to announce the launch of our latest state-of-the-art (SOTA) object detection model for 2023 - YOLOv8 🚀!

Designed to be fast, accurate, and easy to use, YOLOv8 is an ideal choice for a wide range of object detection, image segmentation and image classification tasks. With YOLOv8, you'll be able to quickly and accurately detect objects in real-time, streamline your workflows, and achieve new levels of accuracy in your projects.

Check out our YOLOv8 Docs for details and get started with:

pip install ultralytics
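
Once installed, a minimal Python sketch of loading a pretrained model and running inference (the image path is a placeholder; see the YOLOv8 Docs for the full API):

from ultralytics import YOLO  # YOLOv8 Python API

model = YOLO("yolov8n.pt")            # load a small pretrained detection model
results = model("path/to/image.jpg")  # run inference; returns a list of Results
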
glenn-jocher commented 1 year ago

@China-Young7 yes Recall should be 1 at zero confidence. Can you show an example Recall-Confidence curve where this is not the case?

glenn-jocher commented 1 year ago

Recall-Confidence curve looks like this for my results.

[image: Recall-Confidence curve]

China-Young7 commented 1 year ago

Like this. Also, the 0.85 in your result: what does 0.85 mean? Shouldn't that point be the recall value when the confidence is 0? Or maybe I have a wrong understanding of that point. Thanks, my savior.

[image: 1680431103179]

China-Young7 commented 1 year ago

@glenn-jocher Is it because the model misses some targets and treats them as background, since none of the anchor boxes match these targets?

github-actions[bot] commented 1 year ago

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

For additional resources and information, please see the links below:

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

srishti-buyume commented 1 year ago

@China-Young7 yes Recall should be 1 at zero confidence. Can you show an example Recall-Confidence curve where this is not the case?

Hello @glenn-jocher, I have a similar graph where recall is 0.8 at a confidence of 0.00, not 1 as you said it should be. What could be the reason for this?

glenn-jocher commented 1 year ago

Hello @srishti-buyume, thank you for bringing this to our attention. In theory, the recall should indeed be 1 at a confidence of 0. However, there could be several factors that contribute to a recall value of 0.8 at a confidence of 0.00.

One possibility is that the model may be missing some targets and treating them as background. This could happen if none of the anchor boxes match well with these targets. Additionally, there may be other factors such as the dataset quality, training settings, or model architecture that could affect the recall values.
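As a rough illustration of the first point (a simplified sketch with made-up numbers, not YOLOv5's actual evaluation code): ground-truth boxes that no prediction ever matches remain false negatives at every threshold, so recall at confidence 0 is capped below 1.

num_gt = 100               # hypothetical number of ground-truth boxes in the val set
matched_at_any_conf = 80   # hypothetical GT boxes matched by at least one prediction
# GT boxes with no matching prediction are false negatives at every threshold,
# so even at a confidence threshold of 0 the recall cannot exceed this ratio:
recall_at_conf_0 = matched_at_any_conf / num_gt
print(recall_at_conf_0)    # 0.8 -> the Recall-Confidence curve plateaus at 0.8
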

To further investigate this issue, it would be helpful if you could provide more details about your specific setup, including the dataset, training parameters, and model architecture. This will enable us to better understand the potential causes and provide more specific guidance.

Thank you for your contribution to the YOLOv5 community, and we look forward to assisting you further.

srishti-buyume commented 1 year ago

Thanks for your prompt reply @glenn-jocher

To further investigate this issue, it would be helpful if you could provide more details about your specific setup, including the dataset, training parameters, and model architecture.

Here is the setup: the dataset includes 2000 images (Flickr face images) with around 10,000 bounding box annotations for the following classes:

  1. acne
  2. pimple
  3. mole 1
  4. mole 2
  5. dark spot
  6. scar

The main classes that we are interested in detecting accurately are acne, pimple, and dark spot.

These classes are defined by certain similar-looking features; for example, acne is supposed to be a bumpy white-brown cystic lesion, while a pimple looks similar but has redness around it. I am not sure if this is the right approach for annotating such similar-looking objects, and I would welcome any suggestions to correct it.

The model I am using is the YOLOv8-medium object detection model with the default configuration. I have also tried various optimizer and learning-rate combinations, but the recall value is never 1 at 0.0 confidence, and there is little to no improvement in mAP when training with different hyperparameters. The validation box_loss and dfl_loss also increase in all cases. I even dropped all classes except acne and pimple, and the mAP is still 0.36. The attached graphs look similar even when experimenting with a different optimizer (such as Adam) and different learning rates (0.001 or 0.0001).

What is your conclusion about this training behavior, and do you have any suggestions for improving it?

Please find attached some graphs (default YOLOv8 train config: epochs=50, batch size=8, optimizer=SGD, lr0=0.01):

[image: exp21_pr_curve]

[image: exp21_recall]

[image: exp21_results]

If you need more details, please let me know.

glenn-jocher commented 1 year ago

Thanks for sharing the details of your setup and the specific issues you're facing, @srishti-buyume.

Based on the information you've provided, it seems like you're training a YOLOv8-medium model on a dataset of 2000 images with around 10000 bounding box annotations for classes such as acne, pimple, mole, dark spot, and scars. You're particularly interested in accurately detecting acne, pimple, and spots.

Regarding your concern about similar-looking classes, it's important to ensure that the annotations distinguish between the different classes clearly. In cases where classes have similar appearances, providing additional annotated visual cues (such as color variations or shape differences) could help the model differentiate them more effectively. You might consider refining your annotation process to improve class separability.

Now, concerning the training results and the observed recall values: a recall of 1 at 0.0 confidence is the ideal scenario, but it is not always attainable. The recall value represents the fraction of ground-truth objects that are correctly detected, while the confidence value reflects the model's certainty in its predictions. In practice it is not uncommon to have some missed detections, i.e. false negatives, which lower the recall value at every threshold, including 0.
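In short (informal notation, not taken from the YOLOv5/YOLOv8 source):

recall = TP / (TP + FN)

where, at a confidence threshold of 0, FN counts the ground-truth objects that no prediction matches at any confidence. A curve that plateaus at 0.8 therefore suggests that roughly 20% of the labelled objects are never detected at all, regardless of threshold.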

Regarding the mAP (mean Average Precision) and validation loss, it's essential to assess them together with other evaluation metrics, as they provide a more comprehensive understanding of the model's performance. It's not unusual for validation loss to increase initially and then stabilize or decrease as the model learns more during training.

To improve your model's performance, you could try experimenting with different training configurations. Consider adjusting hyperparameters such as learning rate, batch size, and training duration. Additionally, you may want to consider using other advanced optimization techniques like learning rate schedules or different optimizers to further fine-tune your model. It could also be beneficial to review the distribution and quality of your dataset to ensure it adequately represents the real-world scenarios you're targeting.

Lastly, if you're not seeing significant improvements even after trying different hyperparameters, it might be worth considering additional data augmentation techniques, such as random cropping, rotation, or brightness adjustments, to increase the diversity of your training data.
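For reference, here is a rough sketch of how these settings could be adjusted with the ultralytics Python API. The dataset path and all values are illustrative placeholders, not a recommended recipe:

from ultralytics import YOLO

model = YOLO("yolov8m.pt")           # medium detection model, as in your setup
model.train(
    data="skin-conditions.yaml",     # placeholder path to your dataset config
    epochs=100,                      # longer training run than 50 epochs
    batch=16,                        # alternative batch size
    optimizer="AdamW",               # or "SGD"
    lr0=0.001,                       # initial learning rate
    cos_lr=True,                     # cosine learning-rate schedule
    degrees=10.0,                    # random rotation augmentation
    fliplr=0.5,                      # horizontal flip probability
    hsv_v=0.4,                       # brightness (value) augmentation
)
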

I hope these suggestions help. If you have any further questions or require more specific information, do not hesitate to ask. Good luck with your model training!

srishti-buyume commented 1 year ago

To improve your model's performance, you could try experimenting with different training configurations. Consider adjusting hyperparameters such as learning rate, batch size, and training duration. Additionally, you may want to consider using other advanced optimization techniques like learning rate schedules or different optimizers to further fine-tune your model

Hello @glenn-jocher, thanks for such a detailed response. I have experimented with many combinations of hyperparameters (optimizer, learning rate, etc.), and the results show very little improvement. I tried data augmentations such as rotation, flipping, and cropping, and they had no impact on mAP or the other results whatsoever. I have not yet tried training for more epochs, because I thought the model was already overfitting based on the graphs; I will train for more epochs and try different batch sizes. Also, I personally checked the annotated dataset, which was labelled by two data annotators, and the accuracy of the annotations is only about 70%: there are a lot of missing annotations and mis-classifications in the dataset. So I will work further on improving the dataset quality.

glenn-jocher commented 1 year ago

Hello @srishti-buyume, thank you for providing the additional information. It's good to hear that you've already experimented with different hyperparameters and data augmentations, although you did not observe significant improvements in the results.

Considering the accuracy of the annotated dataset, it's crucial to ensure high-quality annotations for training robust and accurate models. Since you mentioned roughly 70% annotation accuracy, with missing annotations and mis-classifications, improving the dataset quality would certainly be a valuable step. Reviewing and refining the annotations, and addressing any missing or mis-classified labels, can have a significant positive impact on the model's performance.

Regarding the model training duration, extending the number of epochs and trying different batch sizes are reasonable steps to explore. Sometimes, models require more training iterations to learn complex patterns and generalize better. Increasing the training duration might help the model improve its performance, but it's essential to monitor the training loss and evaluation metrics to identify the optimal point where further training does not lead to overfitting.
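One simple way to monitor this, sketched with the ultralytics Python API (the checkpoint and dataset paths are placeholders): re-validate a saved checkpoint as the run gets longer and compare the metrics across runs.

from ultralytics import YOLO

model = YOLO("runs/detect/train/weights/best.pt")  # placeholder path to a saved checkpoint
metrics = model.val(data="skin-conditions.yaml")   # evaluate on the dataset's val split
print(metrics.box.map50)                           # mAP@0.5, useful for comparing runs
print(metrics.box.map)                             # mAP@0.5:0.95
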

Keep in mind that achieving the best model performance often involves an iterative process of experimenting with different configurations, data augmentations, and dataset enhancements. Consider analyzing the model's predictions, looking for specific patterns or scenarios that lead to misclassifications, and accordingly adjust your training strategies.

Please feel free to share any progress or ask further questions as you continue your experimentation. We're here to assist you along the way. Good luck with your efforts to improve your model's performance!