junhyukso closed this issue 3 years ago
Hi, we cannot access the validation data during searching and training, since that would cause a data leakage issue. In fact, for such a large dataset the over-fitting problem is very weak. We have verified this experimentally.
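A minimal sketch of the idea, assuming the sub-validation set is a random holdout carved out of the training data so that the real validation set is never touched during search. The function name, the dataset size, the holdout size, and the seed are all hypothetical and not taken from the repository:

```python
import random


def split_train_subval(num_train, subval_size, seed=0):
    """Split training-set indices into a training part and a sub-validation part.

    Hypothetical helper: both parts come from the training set only, so the
    real validation set stays unseen during search and training.
    """
    if not 0 < subval_size < num_train:
        raise ValueError("subval_size must be between 1 and num_train - 1")
    rng = random.Random(seed)          # fixed seed keeps the split reproducible
    indices = list(range(num_train))
    rng.shuffle(indices)
    subval_idx = sorted(indices[:subval_size])   # held out for scoring candidates
    train_idx = sorted(indices[subval_size:])    # used for weight updates
    return train_idx, subval_idx


# Illustrative sizes only.
train_idx, subval_idx = split_train_subval(num_train=1000, subval_size=100)
```

Candidates would then be ranked on `subval_idx` during search, and the real validation set would be used only once for the final reported accuracy.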
Thanks for your great research and code. I'm just curious why you use a small portion of the training data as the sub-validation set, instead of the same amount of validation data. Isn't it likely that the model will over-fit to the training data and therefore be measured with high accuracy?