Get RankVicuna working on Colab

lintool commented 9 months ago

yilinjz commented 9 months ago

Currently working on getting the following code to run on Colab:

from rank_llm.rank_vicuna import RankVicuna

rv = RankVicuna.from_pretrained_model("castorini/rank_vicuna_7b_v1")
query = 'What is the capital of the United States?'
docs = [
    'Carson City is the capital city of the American state of Nevada.',
    'The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.',
    'Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district. ',
    'Capital punishment (the death penalty) has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states.',
] * 100
results = rv.rerank(query=query, documents=docs)
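
For anyone who wants to see the intended call shape without loading the model, here is a minimal stand-in sketch. It is not RankVicuna: toy_rerank is a hypothetical function that orders documents by word overlap with the query, and returning a plain list of strings is an assumption, since this thread does not say what rv.rerank returns. The documents are abbreviated from the snippet above.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def toy_rerank(query, documents):
    """Hypothetical stand-in for a reranker (NOT RankVicuna).

    Orders documents by how many distinct query tokens they contain.
    sorted() is stable, so ties keep their original relative order.
    """
    query_tokens = tokenize(query)
    return sorted(documents, key=lambda doc: -len(query_tokens & tokenize(doc)))


query = "What is the capital of the United States?"
docs = [
    "Carson City is the capital city of the American state of Nevada.",
    "The Commonwealth of the Northern Mariana Islands is a group of islands. Its capital is Saipan.",
    "Washington, D.C. is the capital of the United States.",
    "Capital punishment has existed in the United States since before the United States was a country.",
] * 100  # same repetition as the target snippet: 400 inputs, 4 unique

ranked = toy_rerank(query=query, documents=docs)

if __name__ == "__main__":
    print(len(ranked), "documents; top hit:", ranked[0])
```

The keyword arguments mirror the target call (query=..., documents=...), so swapping in the real method later should only require changing the function being called, assuming the final API keeps those names.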

I talked to @ronakice, and the first step is to add a from_pretrained_model method that loads the pretrained model. I will be working on this.

cc @lintool @sahel-sh

yilinjz commented 8 months ago

Updated 11/2/2023:

The rerank function has been implemented and added to RankVicuna in a recent PR (https://github.com/castorini/rank_llm/pull/24). The PR includes the following:

- Adds support for inline documents/hits and sample demos by creating a wrapper Retriever around the Pyserini retriever.
- Adds a Reranker class that moves the reranking logic from the run_rank_llm script into a module.
- Adds a retrieve-and-rerank module that replicates the run_rank_llm script.

Demos of the new rerank function can be found at https://github.com/castorini/rank_llm/tree/main/demo. To reproduce, please follow one of the two approaches below:

If you have access to a GPU on your local machine, take the following steps:
- Clone the repo (https://github.com/castorini/rank_llm).
- Create and activate a conda environment, cd to rank_llm/, and run pip install -r requirements.txt.
- Reproduce the demos with python demo/rerank_demo_docs.py or python demo/rerank_demo_hits.py.
- Alternatively, try the named dataset mode (e.g., rerank with dl19); see https://github.com/castorini/rank_llm/blob/main/rank_llm/run_rank_llm.py for sample runs.

If you do not have access to a GPU on your local machine, you can reproduce the demos on Colab instead. Follow this notebook for the steps: https://colab.research.google.com/drive/16SJ2r5F19mNvupRNGdGke6u2zgSVbgYI#scrollTo=vhYgQ_xiiwHg
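
The two approaches above can be summarized in a small helper. This is only a sketch under stated assumptions: checking for nvidia-smi on PATH is a rough heuristic for "has a GPU", and the function names, the environment name rank_llm, and the Python version 3.10 are illustrative choices, not something the repo prescribes. The commands are returned as strings to be run in a shell; the script does not execute them.

```python
import shutil

REPO_URL = "https://github.com/castorini/rank_llm"
COLAB_URL = "https://colab.research.google.com/drive/16SJ2r5F19mNvupRNGdGke6u2zgSVbgYI#scrollTo=vhYgQ_xiiwHg"


def local_gpu_available():
    """Heuristic (assumption): an NVIDIA GPU is present if nvidia-smi is on PATH."""
    return shutil.which("nvidia-smi") is not None


def reproduction_plan(has_gpu, env_name="rank_llm", python_version="3.10"):
    """Return the steps to follow, as strings.

    env_name and python_version are illustrative defaults (assumptions).
    """
    if not has_gpu:
        return [f"Open the Colab notebook: {COLAB_URL}"]
    return [
        f"git clone {REPO_URL}",
        f"conda create -n {env_name} python={python_version} -y",
        f"conda activate {env_name}",
        "cd rank_llm",
        "pip install -r requirements.txt",
        "python demo/rerank_demo_docs.py",
    ]


if __name__ == "__main__":
    for step in reproduction_plan(local_gpu_available()):
        print(step)
```

To run the hits demo instead, replace the last command with python demo/rerank_demo_hits.py.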

castorini / ura-projects

Get RankVicuna working on Colab #15