texttron / tevatron

Tevatron - A flexible toolkit for neural retrieval research and development.
http://tevatron.ai
Apache License 2.0

updated eval_beir.sh based on new scripts and arguments #147

Closed: srikanthmalla closed this pull request 3 months ago

srikanthmalla commented 3 months ago

The previous shell script for BEIR evaluation was a copy-paste from Tevatron 1.0. It no longer works because:

  1. the scripts have changed in the main branch
  2. the argument names have changed as well

The updated script works as-is with the evaluation scripts on the main branch.

Current example usage of the updated script:

./eval_beir.sh --lora_name_path /retriever-mistral/checkpoint-7600 \
                    --dataset arguana \
                    --tokenizer mistralai/Mistral-7B-v0.1 \
                    --model_name_path mistralai/Mistral-7B-v0.1 \
                    --embedding_dir beir_embedding_arguana \
                    --query_prefix "Query: " \
                    --passage_prefix "Passage: "
srikanthmalla commented 3 months ago

The previous/first commit always required the lora_name_path arg to be passed. I made it more generic: users pass lora_name_path to use LoRA; if it is omitted, evaluation runs without LoRA.

./eval_beir.sh --dataset arguana \
                    --tokenizer mistralai/Mistral-7B-v0.1 \
                    --model_name_path mistralai/Mistral-7B-v0.1 \
                    --embedding_dir beir_embedding_arguana \
                    --query_prefix "Query: " \
                    --passage_prefix "Passage: " \
                    [--lora_name_path /retriever-mistral/checkpoint-7600]

The argument in [ ... ] is optional; the evaluation script works fine whether or not it is passed.
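
For readers following along, here is a minimal sketch of how a wrapper script can make --lora_name_path optional. The flag names mirror the usage above, but the parsing loop is illustrative and not the exact contents of eval_beir.sh:

#!/bin/bash
# Illustrative argument parsing: --lora_name_path is recorded only if provided.
LORA_NAME_PATH=""
while [[ $# -gt 0 ]]; do
  case "$1" in
    --dataset)         DATASET="$2"; shift 2 ;;
    --tokenizer)       TOKENIZER="$2"; shift 2 ;;
    --model_name_path) MODEL_NAME_PATH="$2"; shift 2 ;;
    --embedding_dir)   EMBEDDING_DIR="$2"; shift 2 ;;
    --query_prefix)    QUERY_PREFIX="$2"; shift 2 ;;
    --passage_prefix)  PASSAGE_PREFIX="$2"; shift 2 ;;
    --lora_name_path)  LORA_NAME_PATH="$2"; shift 2 ;;
    *) echo "Unknown argument: $1"; exit 1 ;;
  esac
done

# Forward LoRA options to the underlying encoding command only when a checkpoint
# path was given; the exact downstream flag names depend on the Tevatron version.
LORA_ARGS=""
if [[ -n "$LORA_NAME_PATH" ]]; then
  LORA_ARGS="--lora_name_path $LORA_NAME_PATH"
fi

With this pattern, omitting --lora_name_path simply leaves LORA_ARGS empty, so the same script covers both the plain-model and the LoRA-checkpoint case.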

MXueguang commented 3 months ago

Thanks for the PR, will merge once we match the arguana score.

srikanthmalla commented 3 months ago

It works! The scores match. Thanks for your suggestion! Let me also add normalize as an optional argument to eval_beir.sh before merging.
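
A valueless boolean flag like --normalize fits the same parsing pattern; a minimal sketch (only the flag name comes from this thread, the surrounding loop is illustrative):

# A boolean flag carries no value, so it consumes a single token (shift once,
# not twice); NORMALIZE_ARG is later appended to the encoding command if set.
NORMALIZE_ARG=""
while [[ $# -gt 0 ]]; do
  case "$1" in
    --normalize) NORMALIZE_ARG="--normalize"; shift ;;
    *) shift ;;   # other arguments handled as in the sketch above
  esac
done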

srikanthmalla commented 3 months ago

Done! Added normalize as an optional argument and tested it. You should get this result

recall_100              all 0.9936
ndcg_cut_10             all 0.6204

for this command:

./eval_beir.sh      --dataset arguana \
                    --tokenizer intfloat/e5-mistral-7b-instruct \
                    --model_name_path intfloat/e5-mistral-7b-instruct \
                    --embedding_dir beir_embedding_arguana_e5_mistral \
                    --query_prefix "Instruct: Given a claim, find documents that refute the claim\nQuery:" \
                    --passage_prefix "" \
                    --normalize
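
For reference, score lines in the recall_100 / ndcg_cut_10 format above are what trec_eval prints. A hedged sketch of reproducing them from a qrels file and a TREC-format run file (the file names here are placeholders, not outputs guaranteed by eval_beir.sh):

# recall.100 and ndcg_cut.10 are standard trec_eval measures; -c averages over
# all queries in the qrels, counting queries missing from the run as zero.
trec_eval -c -m recall.100 -m ndcg_cut.10 qrels.arguana.txt run.arguana.txt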

Please feel free to merge now.

MXueguang commented 3 months ago

thank you for the pull request!