WongKinYiu / yolor

implementation of paper - You Only Learn One Representation: Unified Network for Multiple Tasks (https://arxiv.org/abs/2105.04206)
GNU General Public License v3.0
1.98k stars 524 forks

YOLOR Precision? #78

Open augmentedstartups opened 2 years ago

augmentedstartups commented 2 years ago

I had a question from a student of mine: "YOLOR contains different architectures on GitHub (YOLOR-P6, YOLOR-W6, YOLOR-D6, YOLOR-CSP and YOLOR-CSP-X). What are the differences between these architectures? Are they used for small object detection?

The repository also contains YOLOV4 architectures. What is the objective? Ensemble models?

Using the training Colab, I compared my results with other models such as YOLOV5, YOLOV4 and YOLOV3, and I observed better recall (+10% vs. YOLOV4-P5), lower precision (-10% vs. YOLOV3) and lower mAP@.5 (~2% lower than YOLOV5l). The paper only talks about AP, but metrics like precision are not good. Could this be a good argument for rejecting this algorithm?"

WongKinYiu commented 2 years ago

YOLOR-CSP and YOLOR-CSP-X are for lower input resolution. YOLOR-P6, YOLOR-W6 and YOLOR-D6 are for higher input resolution. They are designed for real-time applications on different devices.

They are baseline models for comparison.

I guess you used the metric code in each repository; these may differ from each other. You could check test.py and metrics.py.
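
As an illustration of why precision and recall can disagree across evaluation scripts while AP stays comparable, here is a minimal sketch. It is not taken from test.py or metrics.py of any of these repositories; the function names and the toy detections are hypothetical, and the AP routine uses plain all-point interpolation as an assumed stand-in for whatever each repository actually does. The point it shows: precision and recall are single operating points that move with the confidence threshold, whereas AP summarizes the whole precision-recall curve.

```python
import numpy as np


def precision_recall_at(conf, is_tp, n_gt, conf_thres):
    """Precision and recall using only detections with confidence >= conf_thres."""
    conf = np.asarray(conf, dtype=float)
    is_tp = np.asarray(is_tp, dtype=bool)
    keep = conf >= conf_thres
    tp = int(np.sum(is_tp & keep))
    fp = int(np.sum(~is_tp & keep))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / n_gt if n_gt else 0.0
    return precision, recall


def average_precision(conf, is_tp, n_gt):
    """Area under the precision-recall curve (all-point interpolation)."""
    order = np.argsort(-np.asarray(conf, dtype=float))
    hits = np.asarray(is_tp, dtype=bool)[order]
    tp_cum = np.cumsum(hits)
    fp_cum = np.cumsum(~hits)
    recall = tp_cum / n_gt
    precision = tp_cum / (tp_cum + fp_cum)
    # Add sentinel points, then make precision monotonically non-increasing.
    mrec = np.concatenate(([0.0], recall, [1.0]))
    mpre = np.concatenate(([1.0], precision, [0.0]))
    mpre = np.maximum.accumulate(mpre[::-1])[::-1]
    # Sum rectangle areas at the points where recall changes.
    idx = np.where(mrec[1:] != mrec[:-1])[0]
    return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))


# Hypothetical toy data: 8 detections for one class, 5 ground-truth boxes.
conf = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
is_tp = [1, 1, 0, 1, 0, 1, 0, 0]
n_gt = 5

for thres in (0.25, 0.5, 0.75):
    p, r = precision_recall_at(conf, is_tp, n_gt, thres)
    print(f"conf_thres={thres}: precision={p:.3f}, recall={r:.3f}")

# AP does not depend on the threshold chosen above.
print(f"AP={average_precision(conf, is_tp, n_gt):.3f}")
```

If two evaluation scripts report precision and recall at different confidence thresholds (or use different IoU matching rules), the numbers are not directly comparable, so it may be safer to evaluate all models with one script before comparing them.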