beir-cellar / beir

A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
http://beir.ai
Apache License 2.0

init_weights error when loading DPR models from HF modelhub #17

Closed: Timoeller closed this issue 3 years ago

Timoeller commented 3 years ago

Congrats on this very well structured, documented and helpful framework for figuring out what's going on in IR, especially on OOD data. Keep up the good work!

When loading DPR models from HF modelhub like:

model = DRES(models.SentenceBERT((
    "facebook/dpr-question_encoder-multiset-base",
    "facebook/dpr-ctx_encoder-multiset-base",
    " [SEP] "), batch_size=128))

I run into a NotImplementedError: Make sure `_init_weights` is implemented for <class 'transformers.models.dpr.modeling_dpr.DPRQuestionEncoder'>

I know you have already converted the model to the sentence-transformers format and that it can be loaded that way, but interoperability with the HF hub would be slick, also for DPR models in other languages such as French or German.

Thanks
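
For readers unfamiliar with this kind of error: a message like the one above typically comes from a base-class hook that every concrete model class is expected to override. The snippet below is only a minimal sketch of that pattern, not the transformers source; `BaseModel`, `EncoderWithoutHook`, `EncoderWithHook` and `try_init` are hypothetical names introduced for illustration.

```python
class BaseModel:
    """Hypothetical stand-in for a framework base class (not the transformers source)."""

    def _init_weights(self, module):
        # The base hook has no sensible default, so it refuses to run.
        raise NotImplementedError(
            f"Make sure `_init_weights` is implemented for {self.__class__}"
        )

    def init_weights(self):
        # Public entry point that delegates to the overridable hook.
        self._init_weights(module=None)


class EncoderWithoutHook(BaseModel):
    """Subclass that does not override the hook, so initialization fails."""


class EncoderWithHook(BaseModel):
    """Subclass that overrides the hook, so initialization succeeds."""

    def _init_weights(self, module):
        self.initialized = True


def try_init(model):
    """Return None on success, or the error message on failure."""
    try:
        model.init_weights()
        return None
    except NotImplementedError as err:
        return str(err)
```

Under this assumption, loading a class through a code path that calls the hook fails unless that class (or the wrapper around it) provides its own implementation.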

thakur-nandan commented 3 years ago

Hi @Timoeller,

I added support for evaluating HF DPR models. You can load them with the snippet below:

model = DRES(models.DPR((
    "facebook/dpr-question_encoder-multiset-base",
    "facebook/dpr-ctx_encoder-multiset-base"), batch_size=128))

This also works with the deepset.ai GermanDPR model:

model = DRES(models.DPR((
    "deepset/gbert-base-germandpr-question_encoder",
    "deepset/gbert-base-germandpr-ctx_encoder"), batch_size=128))

I will push the updates to the pip release, hopefully soon. Until then, you can download the repository and run it locally if required.

Kind Regards, Nandan
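
As a rough illustration of why DPR needs its own wrapper: a DPR-style model encodes queries and passages with two separate encoders and ranks passages by dot product. The sketch below assumes the dense retrieval wrapper only needs an object exposing `encode_queries` and `encode_corpus`, which is my understanding of BEIR's custom-model convention. `toy_embed`, `ToyDualEncoder` and `rank` are hypothetical names, the hashed bag-of-words embedding stands in for real neural encoders, and a real model would return arrays or tensors rather than Python lists.

```python
import math
from typing import Dict, List


def toy_embed(text: str, dim: int = 16) -> List[float]:
    """Hypothetical stand-in for a neural encoder: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # Deterministic bucket (sum of code points); Python's hash() is salted per process.
        vec[sum(ord(ch) for ch in token) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class ToyDualEncoder:
    """Mimics a DPR-style model with separate entry points for queries and passages."""

    def encode_queries(self, queries: List[str], batch_size: int = 16, **kwargs) -> List[List[float]]:
        # A real model would run the question encoder here.
        return [toy_embed(q) for q in queries]

    def encode_corpus(self, corpus: List[Dict[str, str]], batch_size: int = 8, **kwargs) -> List[List[float]]:
        # A real model would run the context encoder here; title and text are joined first.
        return [toy_embed((doc.get("title", "") + " " + doc["text"]).strip()) for doc in corpus]


def rank(model, query: str, corpus: List[Dict[str, str]]) -> List[int]:
    """Return corpus indices sorted by dot-product score, best first."""
    q = model.encode_queries([query])[0]
    docs = model.encode_corpus(corpus)
    scores = [sum(a * b for a, b in zip(q, d)) for d in docs]
    return sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)
```

Calling `rank(ToyDualEncoder(), "capital of germany", corpus)` with a small list of `{"title": ..., "text": ...}` dictionaries returns the passage indices in score order.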

Timoeller commented 3 years ago

Thanks for the fix, tested on current master and it works.