beir-cellar / beir

A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
http://beir.ai
Apache License 2.0

Evaluate "Offline" Models #167

Open arendu opened 4 months ago

arendu commented 4 months ago

Currently, BEIR evaluates a model by calling its `encode_queries` and `encode_corpus` functions, which in turn call the model's forward method.

This works great for PyTorch-based models, but not for models from other deep-learning frameworks.
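
For context, the existing flow looks roughly like this for a framework-bound model. The dataset path and model name below are illustrative; the imports follow BEIR's documented usage, but treat this as a sketch rather than the exact code path:

```python
from beir.datasets.data_loader import GenericDataLoader
from beir.retrieval import models
from beir.retrieval.evaluation import EvaluateRetrieval
from beir.retrieval.search.dense import DenseRetrievalExactSearch as DRES

# Illustrative dataset path / model name.
corpus, queries, qrels = GenericDataLoader("datasets/scifact").load(split="test")

# SentenceBERT wraps a PyTorch model; encode_queries / encode_corpus run its forward pass.
model = DRES(models.SentenceBERT("msmarco-distilbert-base-v3"), batch_size=128)
retriever = EvaluateRetrieval(model, score_function="cos_sim")

results = retriever.retrieve(corpus, queries)
ndcg, _map, recall, precision = retriever.evaluate(qrels, results, retriever.k_values)
```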

To work around this, I've introduced a "general" offline model: static NumPy arrays of query and corpus representations, along with a mapping from each query-id to its candidate corpus-ids.

Thus, a model from any framework can dump its query and corpus representations in NumPy format, and those representations can then be evaluated with this PR; a minimal sketch of such a wrapper follows.
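
Here is a minimal sketch of what such an offline wrapper could look like, assuming the embeddings were dumped with `np.save` in the same order BEIR iterates the queries and corpus (hypothetical class and file names; the PR's actual implementation may differ):

```python
from typing import Dict, List

import numpy as np


class OfflineModel:
    """Hypothetical wrapper that serves precomputed embeddings instead of running a forward pass."""

    def __init__(self, query_emb_path: str, corpus_emb_path: str):
        # Arrays dumped by any framework (TensorFlow, JAX, ...) via np.save.
        # Rows are assumed to be aligned with the order in which BEIR
        # iterates the queries and the corpus.
        self.query_emb = np.load(query_emb_path)    # shape: (num_queries, dim)
        self.corpus_emb = np.load(corpus_emb_path)  # shape: (num_docs, dim)

    def encode_queries(self, queries: List[str], batch_size: int = 16, **kwargs) -> np.ndarray:
        # The query texts are ignored; the precomputed rows are returned as-is.
        return self.query_emb

    def encode_corpus(self, corpus: List[Dict[str, str]], batch_size: int = 16, **kwargs) -> np.ndarray:
        return self.corpus_emb


# Usage with the standard BEIR evaluation flow (assumed, not the PR's exact API):
# from beir.retrieval.search.dense import DenseRetrievalExactSearch as DRES
# from beir.retrieval.evaluation import EvaluateRetrieval
# retriever = EvaluateRetrieval(DRES(OfflineModel("queries.npy", "corpus.npy")), score_function="cos_sim")
```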