asusdisciple opened 6 days ago
👋 Hello @asusdisciple, thank you for your interest in Ultralytics 🚀! We appreciate your thoughtful question! The ability to compare models using standardized metrics is definitely important for robust evaluations.
We recommend checking out the Docs for details on evaluation metrics and examples of Python and CLI usage. This includes how mAP and other metrics are calculated, to give you an understanding of the evaluation process.
For your particular inquiry into metrics like mAP@0.5, the matching methodology, and how bounding boxes are selected, the core logic is integrated into the model's validation/evaluation pipelines. While the library does not currently expose a direct callable like metrics.map50(list_of_boxes, gt_boxes) out of the box, you can explore the code in the val.py script, where the evaluation logic resides.
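For reference, the standard entry point into that pipeline is model.val(). A minimal sketch, assuming the bundled coco8.yaml dataset config (any dataset YAML in the same format works):

from ultralytics import YOLO

model = YOLO("yolo11n.pt")
metrics = model.val(data="coco8.yaml")  # runs the full validation/evaluation pipeline
print(metrics.box.map50)  # mAP@0.5
print(metrics.box.map)    # mAP@0.5:0.95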
If you need further guidance:
As always, ensure you are on the latest version of ultralytics in a compatible environment, so that already-fixed issues are ruled out:
pip install -U ultralytics
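To confirm the environment afterwards, a quick sketch (assuming the checks helper exported at the package top level):

import ultralytics

print(ultralytics.__version__)  # confirm the installed version
ultralytics.checks()            # print environment and dependency summary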
You might also want to explore ways to adapt other models' outputs to YOLO formats so your evaluations stay comparable.
For real-time discussions, join our Discord community 🎧. Alternatively, engage on our Discourse forum or Subreddit for in-depth model comparisons and evaluations.
An Ultralytics engineer will review your issue and provide additional assistance soon 🙂
@asusdisciple thank you for your kind words and question! For metric implementation details, we recommend reviewing our validation code in the BaseValidator class (source) and the metric calculations in metrics.py. To compare models fairly, note that our metrics follow the standard definitions, implemented on an optimized PyTorch backend; for exact details, please refer to the linked source code. For commercial use comparisons, ensure compliance with our licensing terms at https://ultralytics.com/license.
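To make the matching question concrete (which prediction counts as the true positive when several overlap one ground truth), here is a minimal sketch using the box_iou helper from ultralytics.utils.metrics. The greedy argmax shown is a simplification for illustration, not the validator's exact matching procedure:

import torch
from ultralytics.utils.metrics import box_iou

# Hypothetical boxes in xyxy pixel format.
preds = torch.tensor([[10.0, 10.0, 50.0, 50.0], [12.0, 12.0, 52.0, 52.0]])  # two overlapping predictions
gts = torch.tensor([[11.0, 11.0, 51.0, 51.0]])  # one ground-truth box

iou = box_iou(preds, gts)  # IoU matrix of shape (num_preds, num_gts)

# Simplified matching at IoU >= 0.5: the GT is claimed by the highest-IoU
# prediction; the remaining overlapping prediction becomes a false positive.
best = iou[:, 0].argmax()
print(f"Prediction {best} matches the GT; the other is an FP at this threshold.")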
You can use this:
from ultralytics import ASSETS, YOLO
from ultralytics.models.yolo.detect.val import DetectionValidator
from ultralytics.utils.metrics import DetMetrics

model = YOLO("yolo11n.pt")
results = model(ASSETS / "bus.jpg")

metrics = DetMetrics()
val = DetectionValidator()
metrics.names = model.names  # the dictionary of class names

num_images = 1
for i in range(num_images):
    # Process each image's predictions and ground truth separately.
    # We use the model's predictions here, but they can be your saved
    # predictions. Shape [N, 6] as (x1, y1, x2, y2, conf, cls). Type: Tensor.
    boxes = results[i].boxes.data.cpu()
    pred_conf = boxes[:, 4]
    pred_cls = boxes[:, 5]

    # Should be your ground truth; predictions are reused as GT for this example.
    gt_cls = pred_cls  # Shape [N]
    gt_boxes = boxes[:, :4]  # Shape [N, 4]

    tp = val._process_batch(boxes, gt_boxes, gt_cls).int()
    metrics.process(tp, pred_conf, pred_cls, gt_cls)

# This will print all the metrics.
print(metrics.results_dict)
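If your legacy model produces plain Python lists rather than tensors, convert them to the shapes the loop above expects first. A minimal sketch, assuming xyxy pixel coordinates and numeric class ids (the values are hypothetical):

import torch

# Hypothetical legacy outputs: one detection as [x1, y1, x2, y2, conf, cls].
legacy_preds = [[48.0, 399.0, 245.0, 903.0, 0.91, 0.0]]
legacy_gts = [[50.0, 400.0, 240.0, 900.0, 0.0]]  # [x1, y1, x2, y2, cls]

boxes = torch.tensor(legacy_preds)  # Shape [N, 6], as used in the loop above
gt = torch.tensor(legacy_gts)
gt_boxes, gt_cls = gt[:, :4], gt[:, 4]  # Shapes [N, 4] and [N]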
Search before asking
Question
First of all, thanks for providing this great library and making all of it open source; people like you really move the ML community forward!
My problem: I trained a lot of YOLO11 models on different tasks, and it works flawlessly so far. However, for the sake of benchmarking, I want to compare my YOLO models to my old legacy models. Evaluation metrics (for example mAP) are well defined in general but can have subtle implementation deviations between repositories.
That's why I would like to use the Ultralytics metrics to evaluate both models, but it's hard to find the code where this is done. For example, the IoU implementation for bounding boxes is easy to find, but I can't find the logic for mAP@0.5, or how you decide which boxes are selected (for example, taking the max confidence, or max IoU, or whatever, when two boxes are very close to the ground truth).
Since my legacy model just provides a list of bounding boxes, ideally I would like to call something like metrics.map50(list_of_boxes, gt_boxes), if that's possible.
I hope somebody can point me in the right direction.
Additional
No response