swj0419 opened 1 week ago
Thanks @swj0419 !
I saw that many benchmarks have separate task files for each subtask, e.g., CQA on this page. Conceptually, do you think we should maintain one BRIGHT page or multiple ones?
Also, would you suggest separating the standard and long settings into two files, given that the `main_score` differs? We use `ndcg_at_10` for the standard setting and `recall_at_1` for the long setting.
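If we do split them, a rough sketch of how the two task definitions could differ (plain dicts for illustration; in `mteb` this information lives in each task's metadata object, and every field here except `main_score` is illustrative, not the actual API):

```python
# Hypothetical metadata for the two BRIGHT settings. Only main_score is the
# field under discussion; the task names and other fields are illustrative.
BRIGHT_STANDARD = {
    "name": "BrightRetrieval",          # illustrative name
    "type": "Retrieval",
    "main_score": "ndcg_at_10",         # standard setting
}

BRIGHT_LONG = {
    "name": "BrightLongRetrieval",      # illustrative name
    "type": "Retrieval",
    "main_score": "recall_at_1",        # long setting
}

# Keeping them in separate files means each task reports a single,
# unambiguous main_score on the leaderboard.
assert BRIGHT_STANDARD["main_score"] != BRIGHT_LONG["main_score"]
```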
### Checklist for adding MMTEB dataset

Reason for dataset addition: BRIGHT dataset (https://huggingface.co/datasets/xlangai/BRIGHT)

- [x] I have tested that the dataset runs with the `mteb` package.
- [x] I have run the following models on the task (adding the results to the PR). These can be run using the `mteb run -m {model_name} -t {task_name}` command.
  - [x] sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
  - [x] intfloat/multilingual-e5-small
- [x] Run tests locally to make sure nothing is broken using `make test`. (NOTE: `make test` didn't work for me, failing with `pytest: error: unrecognized arguments: -n`, but `pytest --durations=5` passes.)
- [x] Run the formatter to format the code using `make lint`.