beir-cellar / beir

A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
http://beir.ai
Apache License 2.0

Fine-tuned cross-encoder scores are lower than ms-marco zero-shot scores? #156

Open cramraj8 opened 1 year ago

cramraj8 commented 1 year ago

Hi @thakur-nandan, @nreimers

I am fine-tuning either the cross-encoder/ms-marco-electra-base or the cross-encoder/ms-marco-MiniLM-L-12-v2 model on other IR collections (trec-covid or NQ). However, the fine-tuned model's scores are lower than the zero-shot scores. I wonder whether there is a domain shift in the custom datasets, or whether I am doing the training wrong. I am using the sentence-transformers CrossEncoder API for training.

Since these pre-trained models were trained with particular settings (hyperparameters, model architecture, and loss), are they sensitive to those settings during fine-tuning as well?

root-goksenin commented 6 months ago

Hey! Could you share the code, so we can help you diagnose the issue further?