Cannot reproduce the result of “monot5-large-msmarco” on the Dev set of MS MARCO passage.

Hello,

I tried to reproduce the result of monot5-large on the Dev set of the MS MARCO passage dataset. I followed the exact procedure in https://github.com/castorini/pygaggle/blob/master/docs/experiments-msmarco-passage-entire.md. I downloaded the data and used the following command to run the code on an A6000 GPU:

python -um pygaggle.run.evaluate_passage_ranker --split dev --method t5 --model castorini/monot5-large-msmarco --dataset data/msmarco_ans_entire --model-type t5-large --task msmarco --index-dir indexes/index-msmarco-passage-20191117-0ed488 --batch-size 8 --output-file run.monot5.ans_entire.dev.tsv

The results are:

The result appears to be lower than the one reported in the paper (MRR@10: 0.393). Did I do anything wrong when reproducing the results?
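To rule out a problem in the evaluation step itself, MRR@10 can be recomputed directly from the run file. The following is only a minimal sketch, not the official MS MARCO evaluation script. It assumes the run file has tab-separated lines of the form qid, docid, rank, that the qrels file has whitespace-separated lines of the form qid, iteration, docid, relevance, and that the score is averaged over all queries in the qrels. The function names are hypothetical.

```python
from collections import defaultdict


def load_run(lines):
    """Parse run lines, assumed to be qid<TAB>docid<TAB>rank."""
    ranked = defaultdict(list)
    for line in lines:
        qid, docid, rank = line.rstrip("\n").split("\t")
        ranked[qid].append((int(rank), docid))
    # Order each query's candidates by rank.
    return {qid: [d for _, d in sorted(pairs)] for qid, pairs in ranked.items()}


def load_qrels(lines):
    """Parse qrels lines, assumed to be: qid iteration docid relevance."""
    qrels = defaultdict(set)
    for line in lines:
        qid, _, docid, rel = line.split()
        if int(rel) > 0:
            qrels[qid].add(docid)
    return dict(qrels)


def mrr_at_k(run, qrels, k=10):
    """Mean reciprocal rank of the first relevant passage within the top k.

    Averaged over every query in qrels (an assumption); a query with no
    relevant passage in the top k contributes 0.
    """
    if not qrels:
        return 0.0
    total = 0.0
    for qid, relevant in qrels.items():
        for rank, docid in enumerate(run.get(qid, [])[:k], start=1):
            if docid in relevant:
                total += 1.0 / rank
                break
    return total / len(qrels)


# Tiny made-up example, not real MS MARCO data.
example_run = load_run(["q1\td3\t1", "q1\td1\t2", "q2\td9\t1"])
example_qrels = load_qrels(["q1 0 d1 1", "q2 0 d7 1"])
example_score = mrr_at_k(example_run, example_qrels)
```

In the made-up example, q1 has its first relevant passage at rank 2 and q2 has none in the top 10, so the score is (1/2 + 0) / 2. On the real files, the lines would come from the run file passed to --output-file and from the Dev qrels.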

I would be grateful if you could help!

castorini/pygaggle, issue #328