nyu-dl / dl4marco-bert

BSD 3-Clause "New" or "Revised" License

Positive/Negative Ratio #17

Closed ASoleimaniB closed 5 years ago

ASoleimaniB commented 5 years ago

I didn't manage to find the positive-to-negative ratio stated explicitly in your paper or code. Could you please say how many irrelevant (negative) samples the training set has for each relevant (positive) pair?

rodrigonogueira4 commented 5 years ago

The training set comes from this file: https://msmarco.blob.core.windows.net/msmarcoranking/triples.train.small.tar.gz

It contains ~400k positive and ~40M negative examples, so the positive-to-negative ratio is roughly 1 to 100.
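
For anyone who wants to check the arithmetic or recount on their own copy, here is a minimal sketch. It assumes the extracted file is tab-separated with one `query<TAB>positive passage<TAB>negative passage` triple per line; that layout is an assumption, as are the helper names `count_pairs` and `negatives_per_positive`, neither of which comes from this repository.

```python
from typing import Iterable, Tuple


def count_pairs(lines: Iterable[str]) -> Tuple[int, int]:
    """Count distinct (query, positive) and (query, negative) pairs.

    Assumes each line is "query\tpositive\tnegative" (hypothetical layout).
    Lines that do not have exactly three fields are skipped.
    """
    positives, negatives = set(), set()
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue
        query, pos, neg = parts
        positives.add((query, pos))
        negatives.add((query, neg))
    return len(positives), len(negatives)


def negatives_per_positive(n_pos: int, n_neg: int) -> float:
    """Return how many negatives there are for each positive."""
    return n_neg / n_pos


# Approximate counts quoted in the reply above: ~400k positives, ~40M negatives.
approx_ratio = negatives_per_positive(400_000, 40_000_000)
print(f"about 1 positive to {approx_ratio:.0f} negatives")

# Tiny in-memory sample standing in for the real file.
sample = ["q1\tp1\tn1\n", "q1\tp1\tn2\n", "q2\tp2\tn3\n"]
sample_counts = count_pairs(sample)
```

`count_pairs` keeps every distinct pair in memory, so it is only practical on a sample of the file; for the full set a streaming or hashed count would be needed.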