roboflow / roboflow-100-benchmark

Code for replicating Roboflow 100 benchmark results and programmatically downloading benchmark datasets
https://www.rf100.org
MIT License

Which set (val vs test) was used for the benchmark? #46

Closed Louis-Dupont closed 1 year ago

Louis-Dupont commented 1 year ago

Hi,

I have a question about which set was used for the reported results. I went through the scripts and found the following:

yolov5 appears to be evaluated on the test set: https://github.com/roboflow/roboflow-100-benchmark/blob/65ca4bf3b4b9211fb04a9520a9ff6037c9de0fde/yolov5-benchmark/train.sh#L33-L42

I don't see anything similar for yolov7, so it seems to be evaluated on the val set: https://github.com/roboflow/roboflow-100-benchmark/blob/65ca4bf3b4b9211fb04a9520a9ff6037c9de0fde/yolov7-benchmark/train.sh#L31-L33

I also looked at the yolov8 branch and, as with yolov7, it seems to be evaluated on the val set: https://github.com/roboflow/roboflow-100-benchmark/blob/8587f81ef282d529fe5707c0eede74fe91d472d0/yolov8-benchmark/train.sh#L35-L37

Question

On which set were yolov5, yolov7, and yolov8 evaluated to get the reported results? I want to use them for a benchmark, so I need to make sure that I am comparing apples to apples.

image

https://github.com/roboflow/roboflow-100-benchmark/blob/8587f81ef282d529fe5707c0eede74fe91d472d0/metadata/datasets_stats.csv

Thanks a lot in advance!

Jacobsolawetz commented 1 year ago

Hello @Louis-Dupont! The models in that chart and in our arXiv paper were all evaluated on the RF100 validation sets.
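
For anyone who wants to pin the evaluation split explicitly when reproducing numbers, here is a minimal sketch. It only builds the command line for YOLOv5's `val.py`, whose `--task` option selects the split. The helper name `yolov5_val_command` and the example paths are hypothetical and are not taken from this repository's `train.sh`. It also assumes the dataset's `data.yaml` defines a `test:` entry when `split="test"` is requested.

```python
def yolov5_val_command(data_yaml, weights, split="val", img_size=640):
    """Build the argument list for YOLOv5's val.py on a chosen split.

    Hypothetical helper for illustration; it does not run anything itself.
    """
    if split not in {"val", "test"}:
        raise ValueError(f"split must be 'val' or 'test', got {split!r}")
    return [
        "python", "val.py",
        "--data", str(data_yaml),    # dataset config listing the split paths
        "--weights", str(weights),   # trained checkpoint to evaluate
        "--img", str(img_size),      # inference image size
        "--task", split,             # which split val.py evaluates on
    ]


# Example paths below are placeholders, not real files in the benchmark.
val_cmd = yolov5_val_command("dataset/data.yaml", "runs/train/exp/weights/best.pt")
test_cmd = yolov5_val_command(
    "dataset/data.yaml", "runs/train/exp/weights/best.pt", split="test"
)
print(" ".join(val_cmd))
print(" ".join(test_cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from inside a YOLOv5 checkout. Running the same checkpoint with both values makes it easy to see how far the val and test numbers differ before comparing against the reported results.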