Open XY2323819551 opened 2 years ago
Hi, sorry about this. You need this file for training: train_query_passage_pair.tsv, for the --pair-file arg.
/home/zhangxy/QA/ANCE-PRF-main/data/marco_raw_data/qrels.train.tsv
I tried the new pair file, but it still failed. I noticed that the "queries.train.tsv" file I used for the "--query-file" arg has 808731 examples, whereas "train_query_passage_pair.tsv" for the "--pair-file" arg has only 532751 examples. I guess the issue is caused by this mismatch between the two files. Could you provide the file to use for the "--query-file" arg? Thank you very much!
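A quick way to check the suspected mismatch is to compare the query IDs in the two files. The snippet below is only a minimal sketch: it assumes both files are tab-separated with the query ID in the first column (which may not hold for the actual pair file), and the helper names `first_column_ids` and `unjudged_query_ids` are made up for illustration; they are not part of the repository.

```python
from typing import Iterable, Set


def first_column_ids(lines: Iterable[str]) -> Set[str]:
    """Collect the first tab-separated field of every non-empty line.

    Assumption: the query ID is the first column of the TSV.
    """
    ids = set()
    for line in lines:
        line = line.rstrip("\n")
        if line:
            ids.add(line.split("\t", 1)[0])
    return ids


def unjudged_query_ids(query_lines: Iterable[str],
                       pair_lines: Iterable[str]) -> Set[str]:
    """Return query IDs present in the query file but absent from the pair file."""
    return first_column_ids(query_lines) - first_column_ids(pair_lines)


# Usage with real files (paths are placeholders):
# with open("queries.train.tsv", encoding="utf-8") as q, \
#         open("train_query_passage_pair.tsv", encoding="utf-8") as p:
#     missing = unjudged_query_ids(q, p)
#     print(len(missing), "queries have no entry in the pair file")
```

If the returned set is non-empty, the query file contains queries that the pair file does not cover, which would be consistent with the mismatch described above.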
I had this problem before in this issue. I mistakenly thought I had found the correct file, but it seems I hadn't.
For the --query-file arg, please use this file: train_query_judged.tsv
Hello, thanks for your amazing work. I would really like to reproduce it, but I ran into an issue when running the code. Could you help me?
command line: python make_train_from_ranking.py --ranking-file /home/zhangxy/QA/ANCE-PRF/pyserini/runs/run.msmarco-passage.ance.bf.tsv --model-type ANCE --query-file /home/zhangxy/QA/ANCE-PRF-main/data/marco_raw_data/queries.train.tsv --collection-file ./data/msmarco_passage/collection/collection.tsv --pair-file /home/zhangxy/QA/ANCE-PRF-main/data/marco_raw_data/qrels.train.tsv --output data/hard/negative.result --encoder /home/zhangxy/QA/pyserini_for_ance-prf/pyserini/encoders/ance-msmarco-passage
processing:
Load Query: 100%|██████████| 808731/808731 [00:00<00:00, 1140903.16it/s]
Load Collection: 100%|██████████| 8841823/8841823 [00:16<00:00, 521248.96it/s]
Load Q-D Pair: 100%|██████████| 532761/532761 [00:00<00:00, 989247.88it/s]
Load Ranking: 0%| | 0/808731000 [00:00<?, ?it/s]
Traceback (most recent call last):
File "make_train_from_ranking.py", line 94, in
rankings, topk = read_ranking(args.ranking_file, pair, args.prf_k, args.from_top)
File "make_train_from_ranking.py", line 35, in read_ranking
targets = pair[qid].keys()
KeyError: '1'
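The KeyError means the ranking file contains a query ID ('1' here) that has no entry in the pair dictionary. The sketch below illustrates one defensive way to handle that case. It is not the repository's `read_ranking`: the function name `group_ranking_for_judged` is hypothetical, and the shape of `pair` (query ID mapped to a dict of relevant passage IDs) is only inferred from the `pair[qid].keys()` call in the traceback.

```python
from typing import Dict, Iterable, List, Set, Tuple


def group_ranking_for_judged(
    ranking_rows: Iterable[Tuple[str, str]],
    pair: Dict[str, Dict[str, int]],
) -> Tuple[Dict[str, List[str]], Set[str]]:
    """Group (qid, pid) ranking rows by qid, skipping qids missing from `pair`.

    Returns the grouped rankings and the set of skipped query IDs.
    """
    grouped: Dict[str, List[str]] = {}
    skipped: Set[str] = set()
    for qid, pid in ranking_rows:
        # pair.get avoids the KeyError raised by pair[qid] for unjudged queries.
        if pair.get(qid) is None:
            skipped.add(qid)
            continue
        grouped.setdefault(qid, []).append(pid)
    return grouped, skipped


# Toy example: query "1" has no judged passage, so it is skipped.
pair = {"2": {"10": 1}}
rows = [("1", "5"), ("2", "10"), ("2", "7")]
grouped, skipped = group_ranking_for_judged(rows, pair)
```

Skipping unjudged queries only hides the symptom. Using a query file that matches the pair file, as suggested in the reply, avoids the mismatch at the source.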