embeddings-benchmark / arena

Code for the MTEB Arena
https://hf.co/spaces/mteb/arena

WIP: feat: Add ColBERT #37

Open bclavie opened 2 months ago

bclavie commented 2 months ago

Hey @Muennighoff!

Just the indexing code for now (will add the rest tomorrow), but opening the draft PR in case you wanted to take a look at this before the rest comes in!

Goal of the PR

Add support for ColBERT models, starting with Answer.AI's ColBERT-small, served via an API that Answer.AI will host (discussed with @okhat, who's also okay with this being the first ColBERT representative). The aim is to see how multi-vector models of various sizes fare on this benchmark. The querying mechanism within the API is very simple and lives at AnswerDotAI/mteb_arena_colbert_api.
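
For readers less familiar with multi-vector retrieval, here is a minimal sketch of ColBERT-style late-interaction (MaxSim) scoring. It is only an illustration of the general technique, not the code in this PR or in the hosted API: the function names (`maxsim_score`, `rank_documents`) and the toy vectors are hypothetical, and a real setup would use the model's token embeddings and an index rather than brute force.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: Sequence[Vector], doc_vecs: Sequence[Vector]) -> float:
    """Late-interaction score: for each query token vector, take the best
    dot product against any document token vector, then sum over query tokens."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def rank_documents(
    query_vecs: Sequence[Vector], docs: Sequence[Sequence[Vector]]
) -> List[Tuple[int, float]]:
    """Brute-force ranking (hypothetical helper): score every document
    and return (doc_index, score) pairs sorted by descending score."""
    scored = [(i, maxsim_score(query_vecs, d)) for i, d in enumerate(docs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy example with made-up 2-d "token embeddings".
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0]]  # matches both query tokens
doc_b = [[1.0, 0.0]]              # matches only the first query token
ranking = rank_documents(query, [doc_a, doc_b])
```

Unlike a single-vector model, each document keeps one vector per token, which is why the indexing and querying paths need their own handling.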

Changes