pablogranolabar opened this issue 3 years ago

Hi,

I've got a dense IR pipeline with reranking running for a search engine application. However, my rerank scores are lower than the scores from the dense IR run alone.

Scores:

Any thoughts would be greatly appreciated.
Hi @pablogranolabar,
This is indeed strange. Thanks for sharing these values.
Is this a custom dataset, and how many top-k documents are you reranking? Are you using msmarco-distilbert-base-v3 as the bi-encoder for dense retrieval?

Also, could you try using the ms-marco-MiniLM-L-6-v2 (https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-6-v2) model as the reranker? This is a stronger model compared to ms-marco-electra-base.
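For reference, reranking the top-k passages from your dense retriever with this model would look roughly like the sketch below (untested, using the sentence-transformers `CrossEncoder` class; `query` and `candidates` are placeholders for your own data):

```python
from sentence_transformers import CrossEncoder

# Load the suggested cross-encoder (downloaded from the Hugging Face Hub on first use).
cross_encoder = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2", max_length=512)

# Placeholders: substitute your own query and the top-k passages from the dense retriever.
query = "example search query"
candidates = ["first retrieved passage ...", "second retrieved passage ...", "third retrieved passage ..."]

# Score every (query, passage) pair and sort the passages by cross-encoder score.
scores = cross_encoder.predict([(query, passage) for passage in candidates])
reranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)

for passage, score in reranked:
    print(f"{score:.4f}\t{passage[:80]}")
```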
Kind Regards, Nandan Thakur
Hi @NThakur20, thanks for making your work available and for the speedy reply!
Yes, this is a custom dataset: a collection of search engine queries, such as returning company information for a stock ticker.
For reranking, I used the default, which I think is 100 documents.
I will check out MiniLM next, thanks for the help!
Hi again @NThakur20, I swapped out the cross-encoder for ms-marco-MiniLM-L-6-v2, but I am still getting subpar rerank scores after the dense IR stage. Any thoughts?
Hi @pablogranolabar,
Could you manually evaluate the top-k documents (let's say for k=10) and check whether the results are as expected? One reason could be how the test data was annotated.
Could you also share a snippet of your pseudocode so I can check whether everything is working as expected?
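Something like the spot-check below is usually enough. This is just a sketch: `results` and `qrels` are assumed to be the usual BEIR-style dictionaries (query_id -> {doc_id: score} and query_id -> {doc_id: relevance}), so adapt the shapes to however your pipeline stores them.

```python
def spot_check(results, qrels, k=10):
    """Print the top-k reranked documents per query next to the annotated relevant ones.

    Assumes BEIR-style dicts: results[query_id][doc_id] = score, qrels[query_id][doc_id] = relevance.
    """
    for query_id, doc_scores in results.items():
        top_k = sorted(doc_scores.items(), key=lambda item: item[1], reverse=True)[:k]
        relevant = set(qrels.get(query_id, {}))
        print(f"\nQuery {query_id} (annotated relevant: {relevant})")
        for rank, (doc_id, score) in enumerate(top_k, start=1):
            marker = "*" if doc_id in relevant else " "
            print(f"{rank:2d}. {marker} {doc_id}  {score:.4f}")
```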
Kind Regards, Nandan Thakur
Hi @NThakur20, yes, I've experimented with lower k values all the way down to 10, as well as varying batch sizes. Pretty much the same results: rerank scores are lower across the board after dense IR. The dataset is pretty small, though, just about 13K search queries and their expected results. Do you think that would be a large factor here?
And how important would hyperparameter optimization be in this scenario? I've been thinking about putting together an RL environment for that to increase precision, which is low, while recall and the other two scores are consistently high.
Hi @pablogranolabar, maybe try Elasticsearch (BM25) as the first-stage retriever and then rerank the top-k documents using the above-mentioned cross-encoder?
In our publication, we found the lexical retrieval + CE reranking combination to work well.
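A rough sketch of that setup with the beir package is below. It assumes a locally running Elasticsearch instance and a BEIR-formatted dataset on disk; the dataset path, index name, and hostname are placeholders to adjust for your environment.

```python
from beir.datasets.data_loader import GenericDataLoader
from beir.retrieval.evaluation import EvaluateRetrieval
from beir.retrieval.search.lexical import BM25Search as BM25
from beir.reranking.models import CrossEncoder
from beir.reranking import Rerank

# Load a BEIR-formatted dataset (corpus.jsonl, queries.jsonl, qrels/) from disk.
corpus, queries, qrels = GenericDataLoader("path/to/your/dataset").load(split="test")

# Stage 1: lexical retrieval with BM25 over Elasticsearch.
retriever = EvaluateRetrieval(BM25(index_name="your-index", hostname="localhost"))
results = retriever.retrieve(corpus, queries)

# Stage 2: rerank the BM25 top-100 with the cross-encoder.
reranker = Rerank(CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2"), batch_size=128)
rerank_results = reranker.rerank(corpus, queries, results, top_k=100)

# Evaluate both stages with the same metrics to see what the reranker adds (or loses).
print(retriever.evaluate(qrels, results, retriever.k_values))
print(retriever.evaluate(qrels, rerank_results, retriever.k_values))
```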
@thakur-nandan I am experimenting with BM25 + CE for TREC-NEWS, TREC-COVID, and NQ. However, for TREC-COVID I am getting lower reranking performance than the BM25 scores after using ms-marco-MiniLM-L-6-v2 as a zero-shot reranker. Do I have to fine-tune it again? Does the BM25+CE column in the results table of your paper contain scores after fine-tuning MiniLM, or zero-shot performance?
I just realized that after combining title + text into a single multi-field passage and reranking, I was able to reproduce the scores reported in the paper.
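Concretely, the change was to build the passage fed to the cross-encoder from both fields instead of the body text alone. A minimal illustration, assuming the usual BEIR corpus format (doc_id -> {"title": ..., "text": ...}):

```python
def make_rerank_pairs(query, doc_ids, corpus):
    """Build (query, passage) pairs for the cross-encoder, concatenating title + text.

    Assumes a BEIR-style corpus: corpus[doc_id] = {"title": ..., "text": ...}.
    Scoring only the "text" field on TREC-COVID gave noticeably lower reranking numbers.
    """
    pairs = []
    for doc_id in doc_ids:
        doc = corpus[doc_id]
        passage = (doc.get("title", "") + " " + doc.get("text", "")).strip()
        pairs.append((query, passage))
    return pairs
```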