Closed zhanghan9797 closed 1 year ago
Hi, I encoded and searched following the coCondenser-nq README, but my reproduced results are:
Top5 accuracy: 0.2889 Top20 accuracy: 0.4681 Top100 accuracy: 0.6343
Here are my bash scripts:
```bash
# encode_doc.sh
OUTDIR=/data/private/xxx/dataset/nq/new
MODEL_DIR=Luyu/co-condenser-wiki
for s in $(seq -f "%02g" 0 19)
do
  python -m tevatron.driver.encode \
    --config_name $MODEL_DIR \
    --output_dir $OUTDIR \
    --model_name_or_path $MODEL_DIR \
    --fp16 \
    --per_device_eval_batch_size 64 \
    --p_max_len 256 \
    --dataset_proc_num 8 \
    --dataset_name Tevatron/wikipedia-nq-corpus \
    --encoded_save_path /data/xxx/dataset/nq/$s.pt \
    --encode_num_shard 20 \
    --passage_field_separator sep_token \
    --encode_shard_index $s
done
```

```bash
# encode_query.sh
python -m tevatron.driver.encode \
  --output_dir=$OUTDIR \
  --model_name_or_path $MODEL_DIR \
  --config_name $MODEL_DIR \
  --fp16 \
  --per_device_eval_batch_size 64 \
  --q_max_len 32 \
  --dataset_proc_num 2 \
  --dataset_name Tevatron/wikipedia-nq/train \
  --encoded_save_path /data/private/xxx/dataset/nq/new/query_train.pt \
  --encode_is_qry
```
```bash
# search
DEPTH=200
python -m tevatron.faiss_retriever \
  --query_reps /data/private/xxx/dataset/nq/new/query_test.pt \
  --passage_reps /data/private/xxx/dataset/nq/new/'*.pt' \
  --depth $DEPTH \
  --batch_size 128 \
  --save_text \
  --save_ranking_to /data/private/xxx/dataset/nq/new/run.nq.test.txt

python -m tevatron.utils.format.convert_result_to_trec \
  --input run.nq.test.txt \
  --output run.nq.test.teIn

# eval
python -m pyserini.eval.convert_trec_run_to_dpr_retrieval_run \
  --topics dpr-nq-test \
  --index wikipedia-dpr \
  --input run.nq.test.teIn \
  --output run.nq.test.json

python -m pyserini.eval.evaluate_dpr_retrieval \
  --retrieval run.nq.test.json \
  --topk 5 20 100
```
How can I solve this problem?
Luyu/co-condenser-wiki is pre-trained on Wikipedia but not fine-tuned, so encoding with it directly gives poor retrieval accuracy. You need to follow the README mentioned above to fine-tune it on NQ first, using it only as the initialization, and then encode and search with the fine-tuned checkpoint.
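For reference, the fine-tuning step looks roughly like the sketch below (hyperparameter values here are illustrative, not authoritative; the output directory name is a placeholder — check the coCondenser-nq README for the exact command and settings):

```bash
# Fine-tune co-condenser-wiki on NQ before encoding.
# model_nq is a hypothetical output directory; hyperparameters are a sketch.
python -m tevatron.driver.train \
  --output_dir model_nq \
  --model_name_or_path Luyu/co-condenser-wiki \
  --dataset_name Tevatron/wikipedia-nq \
  --fp16 \
  --per_device_train_batch_size 32 \
  --learning_rate 1e-5 \
  --num_train_epochs 40 \
  --q_max_len 32 \
  --p_max_len 256
```

After training finishes, point `MODEL_DIR` in the encode scripts at the fine-tuned checkpoint (e.g. `model_nq`) instead of `Luyu/co-condenser-wiki`, then re-run encoding, search, and eval.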