Open · mateuszwalo opened 4 months ago
👋 Hello @mateuszwalo, thank you for your interest in Ultralytics YOLOv8 🚀! We recommend a visit to the Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.
If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.
If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.
Join the vibrant Ultralytics Discord 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.
Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.

```bash
pip install ultralytics
```
YOLOv8 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled): [environment links not reproduced here]

If the Ultralytics CI badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLOv8 Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.
The recall and precision are based on the IoU threshold that produces the best F1-score. They won't correspond to the generated confusion matrix, because that uses a conf threshold of 0.25 and an IoU threshold of 0.45.
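To compare like with like, validation can be re-run at the confusion matrix's operating point. A minimal sketch, assuming the weights path from this issue (runs/detect/train20/weights/best.pt) and a placeholder data.yaml:

```python
from ultralytics import YOLO

# Weights path taken from this issue; data.yaml is a placeholder
model = YOLO("runs/detect/train20/weights/best.pt")

# Default validation: precision/recall are reported at the threshold
# that maximizes F1, so they need not match the confusion matrix
default_metrics = model.val(data="data.yaml")

# Validation at the confusion matrix's thresholds (conf=0.25, iou=0.45),
# which should make the numbers comparable to TP/FP/FN read off the matrix
matched_metrics = model.val(data="data.yaml", conf=0.25, iou=0.45)

print(default_metrics.box.mp, default_metrics.box.mr)  # mean precision, mean recall
print(matched_metrics.box.mp, matched_metrics.box.mr)
```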
Search before asking
YOLOv8 Component
Hyperparameter Tuning
Bug
I discovered an error during the hyperparameter tuning process. The metrics reported during this phase are calculated incorrectly. Here are several screenshots and code snippets that demonstrate this issue:
```
Tuner: 9/100 iterations complete ✅ (2757.64s)
Tuner: Results saved to runs/detect/tune5
Tuner: Best fitness=0.93986 observed at iteration 6
Tuner: Best fitness metrics are {'metrics/precision(B)': 0.98181, 'metrics/recall(B)': 0.96721, 'metrics/mAP50(B)': 0.98822, 'metrics/mAP50-95(B)': 0.93448, 'val/box_loss': 0.32963, 'val/cls_loss': 0.25273, 'val/dfl_loss': 1.00118, 'fitness': 0.93986}
Tuner: Best fitness model is runs/detect/train20
Tuner: Best fitness hyperparameters are printed below.
```
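Before looking at the hyperparameters, note that the logged fitness checks out against the metrics dict above: Ultralytics computes fitness as a weighted sum of the four box metrics, with precision and recall weighted 0.0 and mAP50/mAP50-95 weighted 0.1/0.9.

```python
# fitness = 0.0*P + 0.0*R + 0.1*mAP50 + 0.9*mAP50-95 (default Ultralytics weights)
fitness = 0.1 * 0.98822 + 0.9 * 0.93448
print(round(fitness, 5))  # 0.93985, matching the logged 0.93986 up to input rounding
```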
Printing 'runs/detect/tune5/best_hyperparameters.yaml'

```yaml
lr0: 0.00712
lrf: 0.00788
momentum: 0.87901
weight_decay: 0.0004
warmup_epochs: 2.15061
warmup_momentum: 0.53286
box: 6.03385
cls: 0.42
dfl: 1.88721
hsv_h: 0.01581
hsv_s: 0.52275
hsv_v: 0.46993
degrees: 0.0
translate: 0.07884
scale: 0.39232
shear: 0.0
perspective: 0.0
flipud: 0.0
fliplr: 0.57013
mosaic: 0.85197
mixup: 0.0
copy_paste: 0.0
```
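The tuned values can be fed straight back into a longer training run. A minimal sketch, assuming the tune directory above and a placeholder model and dataset:

```python
import yaml
from ultralytics import YOLO

# Path taken from the tuner output above
with open("runs/detect/tune5/best_hyperparameters.yaml") as f:
    best_hyp = yaml.safe_load(f)

# Placeholder model and dataset; substitute your own
model = YOLO("yolov8n.pt")
model.train(data="data.yaml", epochs=100, **best_hyp)
```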
After opening the path runs/detect/train20 to view the confusion matrix for the objects predicted by the model, I obtained the following:

[confusion matrix plot - image not reproduced here]

Based on our knowledge from data exploration, we want to calculate Recall and Precision, where

Precision = TP / (TP + FP)
Recall = TP / (TP + FN)

After substituting the confusion matrix results into these formulas, we obtain the following results:
Precision = 0.983
Recall = 0.959
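A minimal sketch of that substitution, where tp, fp, and fn stand in for the counts read off the confusion matrix (the actual counts come from the image, which is not reproduced in this issue):

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Compute precision and recall from raw detection counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall

# Hypothetical counts for illustration only; use the real matrix values
p, r = precision_recall(tp=120, fp=5, fn=8)
print(f"Precision = {p:.3f}, Recall = {r:.3f}")  # Precision = 0.960, Recall = 0.938
```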
These results differ from the ones reported during tuning ('precision(B)': 0.98181, 'recall(B)': 0.96721), which can be misleading. This bug occurs every time I perform hyperparameter tuning and affects only the metrics calculated from the confusion matrix. The example above is for the best-fitness model; other models with different parameters show even greater differences between the metrics reported by YOLO and the actual results.
Environment
No response
Minimal Reproducible Example
```python
from ultralytics import YOLO

# HOME and dataset.location come from the surrounding environment
# (e.g. a notebook with a Roboflow dataset download)
model = YOLO(f"{HOME}/runs/detect/train/weights/best.pt")
data = f"{dataset.location}/data.yaml"
model.tune(data=data, epochs=10, iterations=100, optimizer="AdamW", plots=False, save=False, val=False)
```
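To see exactly what the tuner recorded at each iteration, the per-iteration results file can be inspected. A sketch, assuming the tuner saved tune_results.csv into the run directory from the log above:

```python
import pandas as pd

# Assumed path; the tuner logs one row of fitness + hyperparameters per iteration
results = pd.read_csv("runs/detect/tune5/tune_results.csv")
print(results.sort_values("fitness", ascending=False).head())
```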
Additional
No response
Are you willing to submit a PR?