Which set (val vs test) was used for the benchmark?

Hi,

I have a question about which set was used for the reported results. I went through the scripts, and this is what I found:

It seems that yolov5 is evaluated on the test set: https://github.com/roboflow/roboflow-100-benchmark/blob/65ca4bf3b4b9211fb04a9520a9ff6037c9de0fde/yolov5-benchmark/train.sh#L33-L42

But I don't see anything similar for yolov7, so it seems to be evaluated on the val set: https://github.com/roboflow/roboflow-100-benchmark/blob/65ca4bf3b4b9211fb04a9520a9ff6037c9de0fde/yolov7-benchmark/train.sh#L31-L33

I also had a look at the yolov8 branch and, as with yolov7, it seems to be evaluated on the val set: https://github.com/roboflow/roboflow-100-benchmark/blob/8587f81ef282d529fe5707c0eede74fe91d472d0/yolov8-benchmark/train.sh#L35-L37

Question

On which set were yolov5, yolov7, and yolov8 evaluated to get the reported results? I want to use these results in a benchmark, so I need to make sure that I am comparing apples to apples.

https://github.com/roboflow/roboflow-100-benchmark/blob/8587f81ef282d529fe5707c0eede74fe91d472d0/metadata/datasets_stats.csv

Thanks a lot in advance!
