texttron / tevatron

Tevatron - A flexible toolkit for neural retrieval research and development.
http://tevatron.ai
Apache License 2.0
435 stars, 87 forks

About hard negative mining on NQ #107

Closed: x-zb closed this issue 3 months ago

x-zb commented 4 months ago

Hi :),

Thank you for your project. I'm wondering whether the code to generate the self-mined hard negatives for NQ has been released. Also, what hyperparameters did you use to generate them, such as the search depth k, and did you exclude all of the positive passages or only the first one?

In hn.json it seems you have 30 hard negatives for each question. Could you share how you obtained them? We found that the pool of hard negatives has a huge impact on final performance, and we'd like to apply the same procedure to other datasets.

Thanks in advance.
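For readers following along, the knobs asked about above (search depth k, which positives to exclude, and how many negatives to keep per question) can be sketched as below. This is a generic illustration only, not Tevatron's released procedure: the function name `mine_hard_negatives`, its parameters, and the default `depth=200` are all hypothetical, and `n_negatives=30` merely mirrors the count observed in hn.json. It assumes each ranking is a list of unique passage IDs already produced by some retriever.

```python
import random
from typing import Dict, List, Sequence


def mine_hard_negatives(
    rankings: Dict[str, Sequence[str]],
    positives: Dict[str, Sequence[str]],
    depth: int = 200,          # assumed search depth k, not a confirmed value
    n_negatives: int = 30,     # mirrors the 30 negatives seen in hn.json
    exclude_all_positives: bool = True,
    seed: int = 0,
) -> Dict[str, List[str]]:
    """Sample hard negatives from the top-`depth` retrieved passages.

    rankings:  query ID -> passage IDs ordered by retrieval score.
    positives: query ID -> known positive passage IDs.
    """
    rng = random.Random(seed)
    mined: Dict[str, List[str]] = {}
    for qid, ranked in rankings.items():
        pos = list(positives.get(qid, []))
        # Either drop every known positive, or only the first one.
        banned = set(pos) if exclude_all_positives else set(pos[:1])
        candidates = [pid for pid in ranked[:depth] if pid not in banned]
        # Keep at most n_negatives; fewer if the candidate pool is small.
        k = min(n_negatives, len(candidates))
        mined[qid] = rng.sample(candidates, k)
    return mined


# Toy example with made-up passage IDs.
rankings = {"q1": ["p3", "p1", "p7", "p2", "p9"]}
positives = {"q1": ["p1", "p2"]}
strict = mine_hard_negatives(rankings, positives, depth=4, n_negatives=2)
loose = mine_hard_negatives(
    rankings, positives, depth=4, n_negatives=3, exclude_all_positives=False
)
```

In the toy example, excluding only the first positive leaves `p2` in the negative pool, which is exactly the kind of false negative the question about exclusion is concerned with.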

chenzhongwu commented 2 months ago

Hi! Have you solved your problem of how to get the hard negatives for each question? Thanks in advance.