sebastian-hofstaetter / matchmaker

Training & evaluation library for text-based neural re-ranking and dense retrieval models built with PyTorch
https://neural-ir-explorer.ec.tuwien.ac.at/
Apache License 2.0

how should I get candidate_file #9

Closed haiahaiah closed 3 years ago

haiahaiah commented 3 years ago

Hi, I am trying to reproduce your results with this code, and I don't know how to get the candidate_file. I followed your instructions:

2. Prepare the dataset for multiprocessing: Generate the validation sets (BM25 results from Anserini) via matchmaker/preprocessing/generate_validation_input_from_candidate_set.py

I found the castorini/anserini project, which produces BM25 results for MS MARCO Passage Ranking. However, its output only lists the ranked doc_ids for each query, not lines like '2 Q0 1782337 1 21.656799 Anserini', which is the format shown in matchmaker/preprocessing/generate_validation_input_from_candidate_set.py.

So I would like to ask how I should get the candidate_file. I would appreciate some more detailed guidance. Thanks a lot~
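
For readers with the same problem, here is a minimal sketch of one possible workaround. It is not the maintainer's method. It assumes the Anserini output is a tab-separated file with one `qid<TAB>doc_id<TAB>rank` triple per line, and it rewrites each triple into the six-column layout of the example line quoted above (`qid Q0 doc_id rank score run_tag`). The function name `msmarco_run_to_trec` is hypothetical. The rank-only input carries no BM25 score, so the sketch writes `1/rank` as a placeholder. Whether generate_validation_input_from_candidate_set.py reads the score column has not been checked here.

```python
def msmarco_run_to_trec(lines, run_tag="Anserini"):
    """Convert 'qid<TAB>doc_id<TAB>rank' lines into TREC-style run lines.

    Assumption: the input has exactly three tab-separated columns.
    The score is a placeholder (1/rank), NOT a real BM25 score.
    """
    converted = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        qid, doc_id, rank = line.split("\t")
        rank = int(rank)
        placeholder_score = 1.0 / rank  # keeps the ordering, nothing more
        converted.append(
            f"{qid} Q0 {doc_id} {rank} {placeholder_score:.6f} {run_tag}"
        )
    return converted


# Illustrative input; the second doc_id is made up for this example.
sample = ["2\t1782337\t1", "2\t1001873\t2"]
for row in msmarco_run_to_trec(sample):
    print(row)
```

If the real BM25 scores matter for your setup, a run file that already contains a score column would be the better starting point. The placeholder only preserves the ranking order.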

linzhu1967 commented 3 years ago

Could you please tell me how you solved it?