castorini / rank_llm

Repository for prompt-decoding using LLMs (GPT3.5, GPT4, Vicuna, and Zephyr)
http://rankllm.ai
Apache License 2.0

Time profiling of RankVicuna or RankZephyr zero-shot evaluation/inference on BEIR datasets #108

Open cramraj8 opened 2 months ago

cramraj8 commented 2 months ago

Hi, I am wondering about the time profile of each LLM when run across queries for re-ranking. I am running RankVicuna and RankZephyr in the zero-shot setting across BEIR datasets. For FiQA (648 queries), re-ranking the BM25 top-100 documents with RankVicuna takes ~4.5 hrs on a powerful machine (H100 GPU), which works out to roughly 25 seconds per query (4.5 hrs / 648 queries). I wonder whether this matches the timing others have observed, or whether the code can be sped up with a different window size or stride. Thanks in advance!
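
For reference, a rough back-of-the-envelope sketch of how window size and stride determine the number of LLM calls per query under a RankGPT-style sliding-window strategy. The window size of 20 and stride of 10 below are assumptions for illustration, not a statement of rank_llm's actual defaults; the timings are just the numbers reported above divided out.

```python
import math

def num_windows(num_docs: int, window_size: int, stride: int) -> int:
    """Number of sliding-window passes (i.e., LLM calls) needed to re-rank
    num_docs candidates, assuming one call per window position."""
    if num_docs <= window_size:
        return 1
    return math.ceil((num_docs - window_size) / stride) + 1

# Hypothetical configuration for illustration: BM25 top-100 candidates,
# window of 20 with stride 10 (assumed values, not confirmed defaults).
calls_per_query = num_windows(num_docs=100, window_size=20, stride=10)  # -> 9

queries = 648                          # FiQA test queries
total_seconds = 4.5 * 3600             # observed wall-clock time (~4.5 hrs)
seconds_per_query = total_seconds / queries          # ~25 s per query
seconds_per_call = seconds_per_query / calls_per_query  # ~2.8 s per LLM call

print(f"{calls_per_query} LLM calls/query, "
      f"{seconds_per_query:.1f} s/query, {seconds_per_call:.1f} s/call")
```

Under these assumptions, a larger window or larger stride reduces the number of windows per query (e.g., window 40 / stride 20 gives 4 calls instead of 9), but each call processes a longer prompt, so the net speedup would need to be measured rather than assumed.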