castorini / rank_llm

RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking.
http://rankllm.ai
Apache License 2.0

Added Support for Bge-Reranker-v2 into RankLLM #132

Open Yuv-sue1005 opened 3 months ago

Yuv-sue1005 commented 3 months ago

Pull Request Checklist

Reference Issue

ref: N/A


PR Type

Documentation

Dependencies

Aside from rank_llm's general setup, install the following:

pip install -U FlagEmbedding
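As a quick sanity check after installing, the sketch below tests whether the package can be imported in the current environment. The helper name `is_installed` is hypothetical and not part of rank_llm; it only relies on the standard library.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if not is_installed("FlagEmbedding"):
    # Hypothetical message; adjust to your own setup instructions.
    print("FlagEmbedding not found, run: pip install -U FlagEmbedding")
```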

Running BGE

Run a BGE reranker from the rank_llm directory with a single command:

# To hide progress bars, set the environment variable TQDM_DISABLE=1
python src/rank_llm/scripts/run_rank_llm.py --model_path=insert_model_name_on_hf --dataset=insert_dataset_path_or_name --retrieval_method=insert_retrieval_method --prompt_mode=bge-reranker-v2 --batch_size=insert_batch_size --context_size=insert_context_size
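For scripted runs, the same command can be assembled in Python. This is a minimal sketch, not part of rank_llm: `build_rerank_command` and `quiet_env` are hypothetical helpers, and only the flags shown in the command above are used. The optional flags are omitted when no value is given.

```python
import os
import shlex

# Script path as used in the command above (relative to the rank_llm directory).
SCRIPT = "src/rank_llm/scripts/run_rank_llm.py"


def build_rerank_command(model_path, dataset, retrieval_method,
                         batch_size=None, context_size=None):
    """Build the argument list for a BGE reranking run (hypothetical helper)."""
    cmd = [
        "python", SCRIPT,
        f"--model_path={model_path}",
        f"--dataset={dataset}",
        f"--retrieval_method={retrieval_method}",
        "--prompt_mode=bge-reranker-v2",
    ]
    if batch_size is not None:
        cmd.append(f"--batch_size={batch_size}")
    if context_size is not None:
        cmd.append(f"--context_size={context_size}")
    return cmd


def quiet_env():
    """Copy the current environment and disable tqdm progress bars."""
    env = dict(os.environ)
    env["TQDM_DISABLE"] = "1"
    return env


cmd = build_rerank_command("BAAI/bge-reranker-v2-m3", "dl19", "bm25")
print(shlex.join(cmd))
# To launch it from the rank_llm directory:
#   import subprocess
#   subprocess.run(cmd, env=quiet_env(), check=True)
```

Passing the arguments as a list avoids shell quoting issues, and copying the environment keeps `TQDM_DISABLE` from leaking into the parent process.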

Tests

The commands below run each BGE reranker variant on the TREC Deep Learning 2019 dataset (dl19) with BM25 first-stage retrieval, as a sanity check that the integration works.

# base
python src/rank_llm/scripts/run_rank_llm.py --model_path=BAAI/bge-reranker-base --dataset=dl19 --retrieval_method=bm25 --prompt_mode=bge-reranker-v2

# large
python src/rank_llm/scripts/run_rank_llm.py --model_path=BAAI/bge-reranker-large --dataset=dl19 --retrieval_method=bm25 --prompt_mode=bge-reranker-v2

# m3
python src/rank_llm/scripts/run_rank_llm.py --model_path=BAAI/bge-reranker-v2-m3 --dataset=dl19 --retrieval_method=bm25 --prompt_mode=bge-reranker-v2

# gemma
python src/rank_llm/scripts/run_rank_llm.py --model_path=BAAI/bge-reranker-v2-gemma --dataset=dl19 --retrieval_method=bm25 --prompt_mode=bge-reranker-v2

# minicpm-layerwise
python src/rank_llm/scripts/run_rank_llm.py --model_path=BAAI/bge-reranker-v2-minicpm-layerwise --dataset=dl19 --retrieval_method=bm25 --prompt_mode=bge-reranker-v2
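The five runs above differ only in the model path, so they can be generated in a loop. The sketch below assumes nothing beyond the commands already listed; `dl19_commands` is a hypothetical helper name, and the code only prints the commands rather than executing them.

```python
import shlex

# Model paths taken from the test commands above.
BGE_MODELS = [
    "BAAI/bge-reranker-base",
    "BAAI/bge-reranker-large",
    "BAAI/bge-reranker-v2-m3",
    "BAAI/bge-reranker-v2-gemma",
    "BAAI/bge-reranker-v2-minicpm-layerwise",
]


def dl19_commands(models=BGE_MODELS):
    """Return one argument list per model for the dl19 + BM25 sanity check."""
    return [
        [
            "python", "src/rank_llm/scripts/run_rank_llm.py",
            f"--model_path={model}",
            "--dataset=dl19",
            "--retrieval_method=bm25",
            "--prompt_mode=bge-reranker-v2",
        ]
        for model in models
    ]


for command in dl19_commands():
    # Replace print with subprocess.run(command, check=True) to execute.
    print(shlex.join(command))
```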