qdrant / fastembed

Fast, Accurate, Lightweight Python library to make State of the Art Embedding
https://qdrant.github.io/fastembed/
Apache License 2.0
1.51k stars, 110 forks

Add nDCG/Recall vs Speed Comparison for All Supported Models #17

Open NirantK opened 1 year ago

NirantK commented 1 year ago

Something that can answer the same questions as those answered here, i.e.:

Which models offer the best recall-to-speed trade-off at query time across different domains?
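
For reference, a minimal sketch of the metrics such a comparison would report. This is not part of fastembed: `recall_at_k`, `ndcg_at_k`, and `time_per_query` are hypothetical helper names, and `embed_fn` stands in for whatever callable wraps the model under test. The nDCG here assumes linear gain with a log2 discount, which is one common convention.

```python
import math
import time
from typing import Callable, Dict, Iterable, List, Sequence


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Iterable[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked_ids: Sequence[str], gains: Dict[str, float], k: int) -> float:
    """nDCG@k with linear gain and log2 discount.

    `gains` maps a document id to its graded relevance; missing ids count as 0.
    """
    dcg = sum(
        gains.get(doc_id, 0.0) / math.log2(rank + 2)
        for rank, doc_id in enumerate(ranked_ids[:k])
    )
    # Ideal ordering: the highest gains first, truncated to k.
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(gain / math.log2(rank + 2) for rank, gain in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


def time_per_query(embed_fn: Callable[[List[str]], object], queries: List[str]) -> float:
    """Mean wall-clock seconds per query for a hypothetical `embed_fn`.

    Assumption: `embed_fn` does all of its work before returning. If it
    returns a lazy generator, wrap it so the result is fully consumed.
    """
    start = time.perf_counter()
    embed_fn(queries)
    return (time.perf_counter() - start) / max(len(queries), 1)
```

Plotting `ndcg_at_k` or `recall_at_k` against `time_per_query` for each model, per dataset, would give the trade-off curve asked about above.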

NirantK commented 1 year ago

Preference order of supported models:

  1. BGE-Small
  2. BGE-Base
  3. Sentence Transformer Family

prrao87 commented 1 year ago

Which datasets are you hoping to cover for the bench? Would be interested to contribute!

NirantK commented 1 year ago

Something permissively licensed from BEIR: https://github.com/beir-cellar/beir/tree/main/examples/dataset

On Wed, 18 Oct 2023 at 17:36, Prashanth Rao @.***> wrote:

Which datasets are you hoping to cover for the bench? Would be interested to contribute!
