paip-2019 / challenge


Validation Set Ground Truth #28

Closed yiping-wang closed 5 years ago

yiping-wang commented 5 years ago

Thanks for answering my previous questions!

According to the challenge website:

For the validation and test set, however, the ground truth information is reserved to the challenge committee and will be used to evaluate the performance of participant's AI learning models.

So it seems we cannot get the ground truth masks for the validation set. However, I believe the purpose of a validation set is to tune the model. Without ground truth masks for the validation set, we have to split the 50 training samples into, for example, 45 for training and 5 for validation, which seems unnecessary since you still have 40 other unseen samples with which to test our models.
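For reference, the 45/5 split mentioned above could be sketched roughly as follows. This is only an illustration: the sample IDs, the `split_holdout` helper, and the fixed seed are hypothetical and not part of the challenge materials.

```python
import random


def split_holdout(sample_ids, n_val=5, seed=0):
    """Shuffle sample IDs reproducibly and hold out n_val of them for validation."""
    if not 0 < n_val < len(sample_ids):
        raise ValueError("n_val must be between 1 and len(sample_ids) - 1")
    ids = list(sample_ids)
    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(ids)
    return ids[n_val:], ids[:n_val]


# Hypothetical IDs standing in for the 50 training samples.
sample_ids = [f"sample_{i:02d}" for i in range(1, 51)]
train_ids, val_ids = split_holdout(sample_ids, n_val=5, seed=0)
print(len(train_ids), len(val_ids))
```

Keeping the seed fixed means the same 5 samples are held out on every run, so scores from different experiments stay comparable.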

Would you consider releasing the ground truth mask for the validation set?

Thanks!

hjoonjang commented 5 years ago

@yiping-wang Hi, thanks for raising this question. I think it is a timely one.

We have actually been worried that the name 'Validation dataset' might be misleading. In ordinary usage, a 'validation set' is a separate subset used for engineering the model, exactly as you said. However, we provide the 'Validation set' only as a one-month leaderboard playground with a nearly unlimited number of attempts, which makes it possible for participants to compare their preliminary results. In contrast, the 'Test set' is for a one-time submission, probably without open leaderboard interaction. This policy was adopted to prevent possible abuse, such as tuning models to fit the test set using leaderboard scores.

We are about to add these details to the challenge page description. We thought for a long time about what to call the 'Validation set' in our context, but we couldn't find a better name. So please note that what we call the 'Validation set' does not have the ordinary meaning: it is for one month of leaderboard interaction with numerous attempts allowed, as opposed to the 'Test set', which is for a single submission. That is why we do not provide ground truth for the 'Validation set', just as we do not for the 'Test set'.

I hope this comment clarifies things. Thanks again.