Just discovered these, in case they help us in any way with the various formats and evaluation strategies:
"Object Detection Metrics. 14 object detection metrics: mean Average Precision (mAP), Average Recall (AR), Spatio-Temporal Tube Average Precision (STT-AP). This project supports different bounding box formats as in COCO, PASCAL, Imagenet, etc."
"A package to read and convert object detection databases (COCO, YOLO, PascalVOC, LabelMe, CVAT, OpenImage, ...) and evaluate them with COCO and PascalVOC metrics."
I have to check them out in more detail. I still have to go through evaluate.py, sorry, I got myself stretched over too many things :/
I put this here as a comment to revisit.
I just ran YOLOv7 (not the tiny version) on ~3000 images of Diptera, using conf 0.3 and iou 0.9. Almost all images contain a single insect. I realised that in some cases YOLO puts 2-4 boxes on the same insect - see an example below:
In this case, all the taxa labels are correct (it is indeed a Diptera here). All the boxes you see are predictions, with the exception of the manually added ground truth (the hazy selected box).
If I understood correctly, in such situations we take as the "best" prediction the one with the highest IoU with the ground truth (the one that fits the ground truth best). Then the rest of the boxes are considered false positives (FP)? In a way they are candidates for the best IoU and they are not completely wrong, especially since they got the taxa label right.
The lower the confidence threshold, the more such cases appear.
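A minimal sketch of that matching rule, assuming boxes come as [x1, y1, x2, y2] pixel coordinates and a single ground-truth box per image; the function names and the 0.5 IoU cut-off are just illustrative, not necessarily what evaluate.py does:

```python
from typing import List, Tuple


def iou(box_a: List[float], box_b: List[float]) -> float:
    """Intersection over Union of two [x1, y1, x2, y2] boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_single_gt(gt_box: List[float],
                    pred_boxes: List[List[float]],
                    iou_threshold: float = 0.5) -> Tuple[int, List[int]]:
    """Return (index of the best-matching prediction or -1, indices counted as FP)."""
    ious = [iou(gt_box, p) for p in pred_boxes]
    best = max(range(len(pred_boxes)), key=lambda i: ious[i]) if pred_boxes else -1
    if best == -1 or ious[best] < iou_threshold:
        # No prediction overlaps enough: the ground truth is a FN, all predictions are FP.
        return -1, list(range(len(pred_boxes)))
    # The highest-IoU prediction is the true positive; every other box is a false
    # positive, even if its class label is correct.
    fps = [i for i in range(len(pred_boxes)) if i != best]
    return best, fps
```

So in the example image above, the box that overlaps the ground truth most would count as the TP and the remaining two or three boxes as FPs, regardless of their (correct) taxa label.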
Follow the issues referred to above.
Evaluate IoU and Accuracy metrics
Create a matching dataframe for all labels and predictions. Challenges:
Calculate IoU for all label-prediction pairs. Challenges:
From the matched labels and predictions, calculate class accuracy metrics using a confusion matrix (a sketch of these three steps follows below).
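If it helps, here is a rough sketch of those three steps, assuming ground truths and predictions are given per image as lists of {"box": [x1, y1, x2, y2], "class": ...} entries; the input format, column names, and the 0.5 IoU cut-off are my assumptions, not something fixed in the repo:

```python
import pandas as pd
from sklearn.metrics import confusion_matrix


def iou(a, b):
    """IoU of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def build_matching_dataframe(ground_truths, predictions):
    """Steps 1 + 2: one row per (label, prediction) pair within an image, with its IoU.

    ground_truths / predictions: dicts mapping image_id -> list of
    {"box": [x1, y1, x2, y2], "class": str} entries (format is an assumption).
    """
    rows = []
    for image_id, gts in ground_truths.items():
        preds = predictions.get(image_id, [])
        for gi, gt in enumerate(gts):
            for pi, pred in enumerate(preds):
                rows.append({
                    "image_id": image_id,
                    "gt_index": gi,
                    "pred_index": pi,
                    "gt_class": gt["class"],
                    "pred_class": pred["class"],
                    "iou": iou(gt["box"], pred["box"]),
                })
    return pd.DataFrame(rows)


def class_confusion_matrix(pairs: pd.DataFrame, iou_threshold: float = 0.5):
    """Step 3: keep the best prediction per ground truth and cross-tabulate classes."""
    # Keep only pairs that overlap enough, then the highest-IoU prediction per ground
    # truth. Note this simple version does not stop one prediction from matching two
    # ground truths; a stricter matcher would enforce one-to-one assignment.
    matched = (pairs[pairs["iou"] >= iou_threshold]
               .sort_values("iou", ascending=False)
               .drop_duplicates(subset=["image_id", "gt_index"], keep="first"))
    labels = sorted(set(matched["gt_class"]) | set(matched["pred_class"]))
    cm = confusion_matrix(matched["gt_class"], matched["pred_class"], labels=labels)
    return pd.DataFrame(cm, index=labels, columns=labels)
```

Class accuracy per taxon could then be read off the diagonal of the returned matrix divided by its row sums; unmatched ground truths (FN) and leftover predictions (FP) would still need to be counted separately.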