Cross Validation - Githubissues

nyu-dl / dl4marco-bert

BSD 3-Clause "New" or "Revised" License


Hi, I read the training section of your paper and found no mention of cross-validation. My dataset is much smaller than MS MARCO, so this matters to me. Here is how I process my data: I have a human-judged file that lists which docids are relevant and which are not. I use these labelled docids to generate the triples file (query, positive_doc, negative_doc). I also have an initial ranked list with the top n docids for each query, and I use it to build a dev.tsv file for the prediction phase. Do I need to do cross-validation during training, and if so, how should I modify the training code? Or is my current approach already correct?
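To make the question concrete, here is a minimal sketch of the kind of cross-validation split I have in mind. It is not part of the dl4marco-bert code; the function name `query_level_folds` and the in-memory triple format are my own assumptions. It splits by query rather than by triple, so that no query ends up in both the training and the validation part of a fold.

```python
import random
from collections import defaultdict


def query_level_folds(triples, k=5, seed=0):
    """Yield k (train, valid) pairs of triple lists.

    `triples` is an iterable of (query, positive_doc, negative_doc) tuples.
    Folds are formed over queries, so every triple of a given query lands
    on the same side of a split. This is a hypothetical helper, not
    something provided by the repository.
    """
    by_query = defaultdict(list)
    for query, pos_doc, neg_doc in triples:
        by_query[query].append((query, pos_doc, neg_doc))

    # Sort before shuffling so the split depends only on the seed.
    queries = sorted(by_query)
    if k < 2 or k > len(queries):
        raise ValueError("k must be between 2 and the number of queries")
    random.Random(seed).shuffle(queries)

    for fold in range(k):
        # Every k-th query, starting at `fold`, goes to validation.
        valid_queries = set(queries[fold::k])
        train, valid = [], []
        for q in queries:
            target = valid if q in valid_queries else train
            target.extend(by_query[q])
        yield train, valid


if __name__ == "__main__":
    # Toy data: 10 queries with 3 triples each.
    toy = [(f"q{i}", f"pos{i}", f"neg{i}_{j}")
           for i in range(10) for j in range(3)]
    for i, (train, valid) in enumerate(query_level_folds(toy, k=5)):
        print(f"fold {i}: {len(train)} train triples, {len(valid)} valid triples")
```

Each fold's training triples would then be written out and converted the same way as the full triples file, and the held-out queries would be used to score that fold's model.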


Cross Validation #31