Closed ASoleimaniB closed 5 years ago
The training set comes from this file: https://msmarco.blob.core.windows.net/msmarcoranking/triples.train.small.tar.gz
It contains ~400k positive and ~40M negative examples, so the positive-to-negative ratio is roughly 1 to 100.
I didn't manage to find the positive-to-negative ratio stated explicitly in your paper or code. Could you please say how many irrelevant (negative) samples the training set contains for each relevant (positive) pair?
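For anyone who wants to check the ratio themselves, here is a minimal sketch. It assumes the extracted file is tab-separated with three columns per line (query, positive passage, negative passage); that layout is an assumption, so please verify it against your copy. The helper names `negatives_per_positive` and `approximate_ratio` are made up for illustration and are not from the paper's code.

```python
from collections import defaultdict


def negatives_per_positive(lines):
    """Count distinct negatives seen with each (query, positive) pair.

    Assumes each line is "query<TAB>positive<TAB>negative" (hypothetical layout).
    Lines that do not have exactly three fields are skipped.
    """
    negatives = defaultdict(set)
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue
        query, positive, negative = parts
        negatives[(query, positive)].add(negative)
    return {pair: len(negs) for pair, negs in negatives.items()}


def approximate_ratio(num_positives, num_negatives):
    """Negatives per positive, given the two totals."""
    return num_negatives / num_positives


# Tiny in-memory sample standing in for the real file.
sample = [
    "q1\tpos_a\tneg_1\n",
    "q1\tpos_a\tneg_2\n",
    "q2\tpos_b\tneg_3\n",
]
counts = negatives_per_positive(sample)

# Using the approximate totals quoted in this thread (~400k and ~40M).
ratio = approximate_ratio(400_000, 40_000_000)
```

To run it on the real data, pass an open file handle instead of `sample`, for example `negatives_per_positive(open(path, encoding="utf-8"))`. Note that the dictionary keeps every pair in memory, which may be heavy for the full file.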