beir-cellar / beir

A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
http://beir.ai
Apache License 2.0

Leaderboard results of USE #50

Closed: ikuyamada closed this issue 2 years ago

ikuyamada commented 2 years ago

Hi, thank you for releasing this great benchmark! I am interested in the retrieval performance of models that are not trained on retrieval datasets (e.g., MS MARCO). I think this benchmark already supports USE, but it is not listed on the leaderboard. Have you already tested the performance of this model on the benchmark? I think it would be very helpful if the leaderboard also provided results for general sentence embedding models (e.g., USE, SimCSE).
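
For anyone who wants to try this themselves, here is a minimal sketch of how USE could be plugged into BEIR's dense retrieval evaluation. The TF Hub URL, the `USEEncoder` adapter name, and the single-pass encoding are illustrative assumptions, not BEIR's official USE integration; BEIR only requires an object exposing `encode_queries` and `encode_corpus`.

```python
# Minimal sketch: evaluating Universal Sentence Encoder (USE) on a BEIR dataset.
import numpy as np
import tensorflow_hub as hub

from beir import util
from beir.datasets.data_loader import GenericDataLoader
from beir.retrieval.evaluation import EvaluateRetrieval
from beir.retrieval.search.dense import DenseRetrievalExactSearch as DRES

class USEEncoder:
    """Adapter exposing the encode_queries/encode_corpus interface BEIR expects."""

    def __init__(self, hub_url: str = "https://tfhub.dev/google/universal-sentence-encoder/4"):
        self.model = hub.load(hub_url)  # assumed TF Hub checkpoint, not BEIR's choice

    def encode_queries(self, queries, batch_size: int = 32, **kwargs):
        return np.asarray(self.model(queries))

    def encode_corpus(self, corpus, batch_size: int = 32, **kwargs):
        # BEIR passes corpus entries as dicts with "title" and "text" fields.
        # DRES already chunks the corpus; per-chunk batching is omitted for brevity.
        texts = [(doc.get("title", "") + " " + doc["text"]).strip() for doc in corpus]
        return np.asarray(self.model(texts))

# Download and load one BEIR dataset (SciFact is small enough for a quick test).
url = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/scifact.zip"
data_path = util.download_and_unzip(url, "datasets")
corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split="test")

retriever = EvaluateRetrieval(DRES(USEEncoder(), batch_size=32), score_function="cos_sim")
results = retriever.retrieve(corpus, queries)
ndcg, _map, recall, precision = retriever.evaluate(qrels, results, retriever.k_values)
```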

thakur-nandan commented 2 years ago

Hi @ikuyamada,

Yes, in the previous version of the BEIR benchmark we included the USE model. I do have the numbers available and will add them to the leaderboard soon.

Do you have a list of specific models you would like to see results for? I can evaluate these models and add their scores to the leaderboard.

Kind Regards, Nandan

ikuyamada commented 2 years ago

Hi @NThakur20,

Thanks for your prompt reply! I look forward to seeing the results for USE. I think people would want to see the results of recent unsupervised models such as SimCSE and CT. I would also personally like to see the results of BPR trained on MS MARCO :)
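
For reference, a SimCSE checkpoint could be scored the same way. This is a rough sketch assuming the public princeton-nlp checkpoint ID below; note that sentence-transformers will auto-convert that checkpoint with mean pooling, whereas SimCSE itself uses [CLS] pooling, so the resulting numbers would only be indicative.

```python
# Minimal sketch: scoring a SimCSE checkpoint on a BEIR dataset via the
# SentenceBERT wrapper (checkpoint ID and pooling caveat noted above).
from beir import util
from beir.datasets.data_loader import GenericDataLoader
from beir.retrieval import models
from beir.retrieval.evaluation import EvaluateRetrieval
from beir.retrieval.search.dense import DenseRetrievalExactSearch as DRES

url = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/scifact.zip"
data_path = util.download_and_unzip(url, "datasets")
corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split="test")

model = DRES(models.SentenceBERT("princeton-nlp/unsup-simcse-bert-base-uncased"), batch_size=128)
retriever = EvaluateRetrieval(model, score_function="cos_sim")
results = retriever.retrieve(corpus, queries)
ndcg, _map, recall, precision = retriever.evaluate(qrels, results, retriever.k_values)
```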

Best regards, Ikuya

nreimers commented 2 years ago

We will report results for them in an upcoming paper. All of the unsupervised methods perform poorly on retrieval tasks, even when trained on in-domain data.

ikuyamada commented 2 years ago

@nreimers Thanks for your answer! The paper sounds really interesting. I'll close this issue for now.