DPWXY / BIRCO

Integrate with MTEB? #1

Open Muennighoff opened 4 months ago

Muennighoff commented 4 months ago

Cool work! It'd be great to have it integrated in MTEB (https://github.com/embeddings-benchmark/mteb) if you're interested :)

BIRCO-benchmark commented 4 months ago

Hi, thanks for reaching out. We'd be happy to have it included in MTEB. For more details you can refer to our paper, https://arxiv.org/pdf/2402.14151

Two things to note. First, each of the tasks included in BIRCO has a unique objective (i.e., an instruction). We provide a reference version of each task-specific complex objective in Appendix B1-B5 of our paper. These objectives are crucial for instructing LLMs and embedding models to understand the task objective without having to fine-tune the model.

Second, our dataset DORIS-MAE is a scientific query-passage reranking dataset. For a given query, each paper abstract in the candidate pool receives a non-binary, continuous score between 0 and 2. Typically, we use 1 as the cutoff for determining relevance. See our previous paper, https://neurips.cc/virtual/2023/poster/73559, for more details.
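
For anyone working on the integration, the cutoff described above could be applied roughly as follows. This is only a minimal sketch: the function name, the example IDs, and the example scores are hypothetical and not taken from DORIS-MAE, and treating a score exactly equal to 1 as relevant (`>=` rather than `>`) is an assumption that should be checked against the paper.

```python
from typing import Dict

# Cutoff of 1 as mentioned in the comment above.
# Assumption: a score exactly equal to the cutoff counts as relevant.
RELEVANCE_CUTOFF = 1.0


def binarize_scores(
    scores: Dict[str, float], cutoff: float = RELEVANCE_CUTOFF
) -> Dict[str, int]:
    """Map continuous relevance scores in [0, 2] to binary labels (1 = relevant)."""
    labels: Dict[str, int] = {}
    for doc_id, score in scores.items():
        # Scores are described as lying between 0 and 2; reject anything else.
        if not 0.0 <= score <= 2.0:
            raise ValueError(f"score for {doc_id!r} is outside [0, 2]: {score}")
        labels[doc_id] = 1 if score >= cutoff else 0
    return labels


# Hypothetical candidate pool for a single query (IDs and scores are made up).
example_scores = {"abstract_a": 1.7, "abstract_b": 0.4, "abstract_c": 1.0}
example_labels = binarize_scores(example_scores)
```

Keeping the continuous scores alongside the binary labels may still be worthwhile, since graded metrics can use the full 0 to 2 range.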

We are also happy to provide further clarification and assistance. You can contact us at the email addresses listed in our papers.

Muennighoff commented 4 months ago

Great, thanks so much for the info! If you have the bandwidth to open a PR for the integration, that'd be amazing; otherwise someone in the community might tackle it, so I've opened an issue: https://github.com/embeddings-benchmark/mteb/issues/818 😊