Closed psyhtest closed 1 year ago
From v3.0, if a submitter provides any results with any models trained on a pre-approved dataset, the submitter must also provide at least one result with the corresponding Closed model trained (or finetuned) on the same pre-approved dataset, and instructions to reproduce the training (or finetuning) process.
I recall we introduced this rule specifically for RetinaNet, just before the benchmark debuted in v2.1. At the time, the RetinaNet dataset, an MLPerf subset of OpenImages, was used for benchmarking one and only one model: the MLPerf variant of RetinaNet. We would therefore miss out on objectively benchmarking other research object detection models, which are typically trained and validated on the COCO dataset. The idea was that a potential submitter would also finetune RetinaNet on COCO, thus providing a useful baseline figure for any comparisons on the alternative dataset.
We at KRAI actually did this for v2.1, measuring mAP=35.293% and publishing the finetuned model. This accuracy is lower than that of the reference model on OpenImages (mAP=37.55%), but much higher than, say, that of the deprecated SSD-ResNet34 model (mAP=20.00%). So a submitter showcasing their highly optimized SSD-ResNet34 implementation could legitimately claim that it is faster than RetinaNet, albeit less accurate.
This is not foolproof, however. A submitter could spend minimal effort on finetuning (or skip it altogether), reporting, for example, that RetinaNet achieves only mAP=10% on the COCO dataset. They could then misleadingly claim that their optimized SSD-ResNet34 implementation is both faster and more accurate than RetinaNet.
When seeking such pre-approval, it is recommended that a potential submitter convincingly demonstrate the accuracy of the corresponding Closed model on the same validation dataset, which may involve retraining or finetuning the Closed model if required.
This is intended to prevent the situation above: at the very least, such a submitter would face scrutiny from the WG at the pre-approval stage :). They might still get away with handwaving it through, though :).
@psyhtest This is perfect. Covers everything we wanted to change. LGTM.