embeddings-benchmark / mteb

MTEB: Massive Text Embedding Benchmark
https://arxiv.org/abs/2210.07316
Apache License 2.0
1.61k stars, 211 forks

calculate_metadata_metrics on MSMARCOv2 goes OOM #992

Open isaac-chung opened 4 days ago

isaac-chung commented 4 days ago

This issue arose from https://github.com/embeddings-benchmark/mteb/pull/988.

Note: MSMARCOv2 has 138 million passages, 3 eval splits, and 1 language.
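
For context on why corpus size matters here: at this scale, any statistic that first collects every passage into a list is likely to exhaust memory. Below is a minimal sketch of a single-pass alternative, assuming the metrics are simple aggregates such as counts and character lengths. `streaming_length_stats` and `fake_passages` are hypothetical names for illustration and are not part of mteb; how `calculate_metadata_metrics` actually works is not shown in this issue.

```python
from typing import Dict, Iterable, Iterator


def streaming_length_stats(texts: Iterable[str]) -> Dict[str, float]:
    """Hypothetical helper (not part of mteb).

    Aggregates text-length statistics in one pass, keeping only a few
    running counters instead of materializing the whole corpus.
    """
    count = 0
    total_chars = 0
    min_len = None
    max_len = 0
    for text in texts:
        n = len(text)
        count += 1
        total_chars += n
        max_len = max(max_len, n)
        min_len = n if min_len is None else min(min_len, n)
    return {
        "num_texts": count,
        "avg_length": total_chars / count if count else 0.0,
        "min_length": min_len if min_len is not None else 0,
        "max_length": max_len,
    }


def fake_passages(n: int) -> Iterator[str]:
    # Stand-in for a lazily loaded corpus; yields one passage at a time.
    for i in range(n):
        yield "x" * (i % 5 + 1)


stats = streaming_length_stats(fake_passages(10))
print(stats)
```

Because the function accepts any iterable, it could in principle be fed from a generator or a streamed dataset, so peak memory would depend on the size of a single passage rather than on the number of passages.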