embeddings-benchmark / mteb

MTEB: Massive Text Embedding Benchmark
https://arxiv.org/abs/2210.07316
Apache License 2.0

Add Prompt Retrieval #95

Open Muennighoff opened 1 year ago

Muennighoff commented 1 year ago

We could add the prompt retrieval benchmark: https://arxiv.org/abs/2209.01975

dipam7 commented 4 months ago

You mean add it as a task, right? Along with all the datasets mentioned in the paper?

KennethEnevoldsen commented 4 months ago

As a benchmark, I suspect, including all of its datasets.

Muennighoff commented 4 months ago

Also cc'ing @hongjin-su here, who knows mteb quite well and may be interested in adding this or helping to add it.

hongjin-su commented 4 months ago

Sure, I could add this!

hongjin-su commented 4 months ago

The performance for prompt retrieval is measured by LLM results in downstream tasks. Back then, the paper used GPT-J. Should we switch to a more up-to-date model, e.g., Llama3-8B or mistral-7B?
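For context, the retrieval step being benchmarked here can be sketched as follows: encode a test query and a pool of candidate in-context examples, then pick the top-k most similar examples to build the LLM prompt. This is a minimal, hypothetical sketch using cosine similarity over toy embeddings; the function name and data are illustrative, not part of mteb's API.

```python
import numpy as np

def retrieve_prompts(query_emb, pool_embs, k=2):
    """Return indices of the k candidate examples most similar to the query."""
    # Normalize so the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb)
    p = pool_embs / np.linalg.norm(pool_embs, axis=1, keepdims=True)
    sims = p @ q
    # Indices of the k most similar examples, best first.
    return np.argsort(-sims)[:k]

# Toy embeddings standing in for encoded training examples (hypothetical data).
pool = np.array([[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]])
query = np.array([0.9, 0.1])
idx = retrieve_prompts(query, pool, k=2)  # e.g. indices used to assemble the prompt
```

The retrieved examples would then be concatenated into a prompt and scored by the downstream LLM's task accuracy, which is what makes the choice of LLM (GPT-J vs. a newer model) matter for the benchmark numbers.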

Muennighoff commented 4 months ago

> The performance for prompt retrieval is measured by LLM results in downstream tasks. Back then, the paper used GPT-J. Should we switch to a more up-to-date model, e.g., Llama3-8B or mistral-7B?

If those newer models mean better evaluation results, then probably a good idea to switch!

hongjin-su commented 3 months ago

I created a PR to include 10 tasks for prompt retrieval. Feel free to check it out!

KennethEnevoldsen commented 2 months ago

Just to let other contributors know that the PR (#608) was never merged. We would still welcome this submission.