facebookresearch / dpr-scale

Scalable training for dense retrieval models.

BEIR reproduction #5

ziqing-huang closed this issue 2 years ago

ziqing-huang commented 2 years ago

Thanks for providing this!

Do you have the scripts for reproducing the results of SPAR on BEIR benchmark?

In particular, did you tune the concatenation weight for the BEIR evaluation?

ccsasuke commented 2 years ago

Hi, we tuned the concatenation weights on the MS MARCO dev set for the BEIR evaluations.
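
For context, here is a minimal sketch of how a concatenation weight enters the SPAR score, assuming the final vectors are the dense and Λ embeddings concatenated, with the weight folded into the query side; `spar_score` and the variable names are illustrative, not the repo's API:

```python
import numpy as np

def spar_score(q_dense, p_dense, q_lexical, p_lexical, weight):
    """Score a query-passage pair via weighted concatenation.

    Folding the weight into the Lambda half of the *query* vector means
    passages can be encoded and indexed once, then reused for any weight:
    the inner product of the concatenated vectors equals
    dense_score + weight * lexical_score.
    """
    q = np.concatenate([q_dense, weight * q_lexical])
    p = np.concatenate([p_dense, p_lexical])
    return q @ p
```

Computing `q_dense @ p_dense + weight * (q_lexical @ p_lexical)` directly gives the identical score, so the weight is simply a scalar trade-off between the dense retriever and Λ.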

ziqing-huang commented 2 years ago

Thanks! Have you released the tuned weights, or would you mind sharing them? That would be helpful!

ccsasuke commented 2 years ago

I found the weights we used for BEIR. (For certain MS MARCO models, much smaller weights are needed. Our standard tuning process selects weights from 0.1 to 10, but if the best weight falls at either end of that range, we continue the search by extending the range 100x in that direction. For instance, if weight=0.1 gave the best results for a model, we would run another grid search over [0.001, 0.1].)
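
A minimal sketch of that tuning loop; `eval_fn` (which would return a dev-set metric such as MRR@10 on MS MARCO dev, higher is better) and the 9-point log-spaced grid are illustrative assumptions, since the thread only specifies the 0.1 to 10 range and the 100x extension rule:

```python
import numpy as np

def tune_concat_weight(eval_fn, low=0.1, high=10.0, points=9):
    """Grid-search the concatenation weight on a dev set.

    eval_fn(weight) -> dev metric (higher is better). If the best weight
    lands on an endpoint of the grid, the range is extended 100x in that
    direction and the search repeats.
    """
    grid = np.geomspace(low, high, num=points)
    while True:
        best = max(grid, key=eval_fn)
        if best == grid[0]:      # best at the low end: search 100x lower
            grid = np.geomspace(best / 100, best, num=points)
        elif best == grid[-1]:   # best at the high end: search 100x higher
            grid = np.geomspace(best, best * 100, num=points)
        else:
            return best
```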

Tuned weights on MS MARCO:

SPAR (Contriever + BM25 Λ): 0.006
SPAR (Contriever + UniCOIL Λ): 0.0333
SPAR (GTR + BM25 Λ): 0.001
SPAR (GTR + UniCOIL Λ): 0.007

ziqing-huang commented 2 years ago

Thank you for your reply! I'm surprised that these weights are so small.