nirmal2k opened 2 years ago
I want to load antique-vbert-pair.p, the already fine-tuned one
I want to validate a pretrained model (antique-vbert-pair.p). How do I do that?
Hi @nirmal2k -- sorry for the delay.
If you're looking to reproduce the results in Training Curricula for Open Domain Answer Re-Ranking, I recommend you train from scratch. Instructions are here. While it's possible to load the weight files into the CLI-based OpenNIR pipelines, it's a bit hacky and tricky to get working.
If instead you're looking to conduct further experiments with the models, inspect outputs, etc., by far the easiest way is the OpenNIR-PyTerrier integration. You can load the model like so:
```python
import pyterrier as pt
if not pt.started():
    pt.init()
import onir_pt  # OpenNIR-PyTerrier integration -- part of OpenNIR

reranker_pair = onir_pt.reranker(
    'vanilla_transformer', 'bert',
    weights='antique-vbert-pair.p',
    ranker_config={'outputs': 2},
    vocab_config={'train': True},
)
```
Then you can use the model in a variety of ways. E.g., if you wanted to conduct a similar experiment on ANTIQUE to the one in the paper, you could do:
```python
import pyterrier as pt
if not pt.started():
    pt.init()
import onir_pt
from pyterrier.measures import *

# Dataset and indexing
dataset = pt.get_dataset('irds:antique/test')
indexer = pt.IterDictIndexer('./antique.terrier')
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

# Models
bm25 = pt.BatchRetrieve(index_ref, wmodel='BM25') % 100  # BM25 with a rank cutoff of 100
reranker_pair = onir_pt.reranker('vanilla_transformer', 'bert', weights='antique-vbert-pair.p', ranker_config={'outputs': 2}, vocab_config={'train': True})
reranker_pair_recip = onir_pt.reranker('vanilla_transformer', 'bert', weights='antique-vbert-pair_recip.p', ranker_config={'outputs': 2}, vocab_config={'train': True})

# Experiment
pt.Experiment(
    [
        bm25,
        bm25 >> pt.text.get_text(dataset, 'text') >> reranker_pair,
        bm25 >> pt.text.get_text(dataset, 'text') >> reranker_pair_recip,
    ],
    dataset.get_topics(),
    dataset.get_qrels(),
    [MRR(rel=3), P(rel=3)@1],
)
```
Which gives the following results:
```
                  name  RR(rel=3)  P(rel=3)@1
0                 bm25   0.506052       0.345
1        reranker_pair   0.733746       0.630
2  reranker_pair_recip   0.761444       0.670
```
(Curiously, a bit better than what was reported in the paper. Probably due to using a different system for the first stage retrieval.)
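As an aside on reading the measures: RR(rel=3) counts a document as relevant only when its graded label is at least 3 (ANTIQUE uses labels 1-4). A hand-rolled sketch of that reciprocal-rank computation (my own illustration, not PyTerrier's implementation; the column names follow PyTerrier's run/qrels conventions):

```python
import pandas as pd

def mean_rr(run: pd.DataFrame, qrels: pd.DataFrame, rel: int = 3) -> float:
    """Mean reciprocal rank, counting a document as relevant only if its label >= rel."""
    labels = {(q, d): l for q, d, l in zip(qrels['qid'], qrels['docno'], qrels['label'])}
    rrs = []
    for qid, group in run.sort_values('score', ascending=False).groupby('qid'):
        rr = 0.0  # a query with no relevant document retrieved contributes 0
        for rank, docno in enumerate(group['docno'], start=1):
            if labels.get((qid, docno), 0) >= rel:
                rr = 1.0 / rank
                break
        rrs.append(rr)
    return sum(rrs) / len(rrs)
```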
Hope this helps!
Thanks for that @seanmacavaney!! I was able to reproduce those results. Reranking 1000 documents gives an MRR@10 of 68.05. Is there a reason for the drop? Also, I'd appreciate it if you could provide a code snippet on how to do a forward pass with the loaded pretrained model, given a query and document text.
> Reranking 1000 documents gives an MRR@10 of 68.05. Is there a reason for the drop?
I don't know definitively, but I suspect a few possible causes. One you can check: some of the relevant documents retrieved deeper in the ranking may not have relevance assessments; you can use measures (e.g., Judged@10) to suss out such cases. I'd be curious to hear what you find if you get to the bottom of this!
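In case it's useful, here's a hand-rolled sketch of what Judged@10 measures: the fraction of each query's top-10 retrieved documents that appear in the qrels at all. This is my own illustration (not the OpenNIR/PyTerrier implementation); the column names follow PyTerrier's run/qrels conventions.

```python
import pandas as pd

def judged_at_k(run: pd.DataFrame, qrels: pd.DataFrame, k: int = 10) -> float:
    """Mean fraction of each query's top-k retrieved documents that have any qrel entry."""
    judged = set(zip(qrels['qid'], qrels['docno']))
    fractions = []
    for qid, group in run.sort_values('score', ascending=False).groupby('qid'):
        top = group.head(k)
        fractions.append(sum((qid, d) in judged for d in top['docno']) / len(top))
    return sum(fractions) / len(fractions)
```

A low Judged@10 for the deeper reranking run would suggest the drop comes from unassessed (rather than non-relevant) documents.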
> Also, I'd appreciate it if you could provide a code snippet on how to do a forward pass with the loaded pretrained model, given a query and document text.
Here ya go!
```python
import pandas as pd

sample_df = pd.DataFrame([
    {'qid': '0', 'query': 'some query text', 'docno': '0', 'text': 'some document text'},
    {'qid': '1', 'query': 'some other query text', 'docno': '1', 'text': 'some other document text'},
])
reranker_pair(sample_df)
```
should give:
```
  qid                  query docno                      text     score
0   0        some query text     0        some document text  8.423386
1   1  some other query text     1  some other document text  7.000756
```
Thanks for the code snippet @seanmacavaney!! As for the reasons for the drop in MRR, the first two reasons you mentioned were at the top of my head too. The third point seems like a hack for the given dataset. I've worked with MS MARCO, and there isn't a drop in MRR when ranking more documents. The results in SBERT rerank the entire corpus of 8.8M passages to get that MRR. Maybe it's just that some relevant documents don't have assessments, as you mentioned.
Thanks for the insights!!
Hi -- can you clarify whether you're interested in using a different model initialisation (e.g., changing `bert-base-uncased` to something else) or using a model that's already been fully tuned for ranking?
Yes, I want to change `bert-base-uncased` to my fine-tuned version of BERT, but I don't know how to achieve that.
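For what it's worth, OpenNIR's `bert` vocab defaults to `bert-base-uncased`, and my understanding (worth verifying against the OpenNIR docs, since this key name is my assumption) is that its `bert_base` config setting controls which HuggingFace model name or local path gets loaded. A sketch under that assumption:

```python
import onir_pt

# Assumption: 'bert_base' is the config key for the underlying BERT checkpoint;
# it can point at a local directory containing a fine-tuned HuggingFace model.
reranker = onir_pt.reranker(
    'vanilla_transformer', 'bert',
    ranker_config={'outputs': 2},
    vocab_config={'bert_base': '/path/to/my-finetuned-bert', 'train': True},
)
```

The ranking layers on top would still start untrained, so you'd fine-tune this reranker on ranking data afterwards.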