Closed guotong1988 closed 4 years ago
A triple in a document ranking problem is (query, positive_doc, negative_doc).
If I already have query-positive_doc pair data, how do I prepare the negative docs?
Random sampling is the baseline policy.
With a human in the loop, I can use BM25 to retrieve candidate docs and then label each of them.
I would prefer to do it without a human.
Thank you very much.
Random sampling is the usual approach, although it tends to produce weak negative docs. If you already have the positive docs, you can run an existing retrieval model for each query and use the retrieved docs that are not in the positive set as negatives.
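To make the two options concrete, here is a minimal sketch in plain Python. It assumes the "existing model" is BM25 (any retriever could be swapped in) and uses a small hand-written scorer rather than a specific library. The function names, the toy corpus, and the `k1`/`b` values are all illustrative assumptions, not something from this repo.

```python
import math
import random
from collections import Counter


def tokenize(text):
    # Whitespace tokenizer; a real pipeline would use something better.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every doc against the query with a plain BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: in how many docs each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    query_terms = tokenize(query)
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def random_negatives(positives, n_docs, k, rng):
    """Baseline: sample k doc indices that are not labeled positive."""
    candidates = [i for i in range(n_docs) if i not in positives]
    return rng.sample(candidates, min(k, len(candidates)))


def hard_negatives(query, docs, positives, k):
    """Take the k highest-scoring docs that are not labeled positive."""
    scores = bm25_scores(query, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [i for i in ranked if i not in positives][:k]


def build_triples(pairs, docs, k=1):
    """pairs maps each query to the set of its positive doc indices."""
    triples = []
    for query, positives in pairs.items():
        for neg in hard_negatives(query, docs, positives, k):
            for pos in sorted(positives):
                triples.append((query, docs[pos], docs[neg]))
    return triples


# Toy data, purely for illustration.
docs = [
    "bm25 is a ranking function used by search engines",
    "neural ranking models are trained on triples",
    "the weather in paris is mild in spring",
    "search engines use an inverted index for retrieval",
]
pairs = {"ranking function for search engines": {0}}

triples = build_triples(pairs, docs, k=1)
baseline = random_negatives({0}, len(docs), 2, random.Random(0))
```

One caveat: a highly ranked doc that is missing from the positive set may simply be an unlabeled positive, so it can help to skip the very top results or mix in some random negatives.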