Closed by laos1984 3 years ago.
I'm not sure why this happens.
First, try transformers 3.4.0 for inference.
Second, I suggest checking whether the generated dev-qrel.tsv is correct. You can do this by using it to evaluate the provided STAR rank file. After downloading the rank file, convert its qids and pids to the preprocessed qoffsets and poffsets. This is a little tricky, but you can refer to cvt_back.py, which converts in the opposite direction (offsets to ids). Then run:
python ./msmarco_eval.py ./data/passage/preprocess/dev-qrel.tsv convt_download_dev.rank.tsv
and check whether MRR@10 is 0.340.
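In case it helps, here is a rough sketch of the ids-to-offsets conversion mentioned above. It is not taken from the repository. It assumes each line of the rank file holds qid, pid and rank separated by whitespace, and that you have already loaded two dicts, qid2offset and pid2offset, mapping integer ids to integer offsets (for example from the mapping files written during preprocessing). The function name and the file paths in the usage note are hypothetical.

```python
def convert_rank_lines(lines, qid2offset, pid2offset):
    """Map 'qid pid rank' rows to 'qoffset<TAB>poffset<TAB>rank' rows.

    Assumption: qid2offset and pid2offset are plain dicts keyed by integer ids.
    """
    converted = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        qid, pid, rank = fields[:3]
        qoffset = qid2offset[int(qid)]
        poffset = pid2offset[int(pid)]
        converted.append(f"{qoffset}\t{poffset}\t{rank}")
    return converted


if __name__ == "__main__":
    # Toy mappings for illustration only; real ones come from preprocessing.
    toy_qid2offset = {1048585: 0}
    toy_pid2offset = {7187158: 42}
    for row in convert_rank_lines(
        ["1048585\t7187158\t1\n"], toy_qid2offset, toy_pid2offset
    ):
        print(row)
```

To use it on the downloaded file, read the lines from wherever you saved the rank file, pass in the two loaded mappings, and write the returned rows to convt_download_dev.rank.tsv. A KeyError here would mean an id in the rank file is missing from the mappings, which is itself a hint that preprocessing went wrong.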
Happy to help you :)
No activity. Closing.
Hi Jingtao,
I am trying to reproduce the results shown in the README. The models were downloaded from Google Drive. I used transformers 2.8.0 for preprocessing and 4.8.2 for inference.
I ran the following commands:
python ./star/inference.py --data_type passage --max_doc_length 256 --mode dev
python ./msmarco_eval.py ./data/passage/preprocess/dev-qrel.tsv ./data/passage/evaluate/star/dev.rank.tsv
And I got the following results:
Eval Started
#####################
MRR @10: 0.010382669304589082
QueriesRanked: 6980
#####################
Could you help me figure out what I did wrong? Thanks!