castorini / ura-projects

Get RankVicuna working on Colab #15

Open lintool opened 9 months ago

lintool commented 9 months ago

https://github.com/castorini/rank_llm

yilinjz commented 9 months ago

Currently working on getting the following code to run on Colab:

from rank_llm.rank_vicuna import RankVicuna

rv = RankVicuna.from_pretrained_model("castorini/rank_vicuna_7b_v1")
query = 'What is the capital of the United States?'
docs = ['Carson City is the capital city of the American state of Nevada.',
        'The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.',
        'Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district. ',
        'Capital punishment (the death penalty) has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states.'
        ] * 100
results = rv.rerank(query=query, documents=docs)

I talked to @ronakice; the first step is to add a from_pretrained_model method to RankVicuna that loads the pretrained model. I will be working on this.
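
For anyone following along, here is a minimal sketch of what a from_pretrained_model classmethod could look like. It is not the rank_llm implementation: SketchReranker, RankFn, the loader argument, and toy_loader are hypothetical names made up for illustration, and the toy loader ranks by word overlap instead of loading castorini/rank_vicuna_7b_v1.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical type: a function that takes a query and candidate passages
# and returns passage indices ordered best-first. In a real setup this
# would wrap a call to the loaded LLM.
RankFn = Callable[[str, List[str]], List[int]]


@dataclass
class SketchReranker:
    """Illustrative stand-in; NOT the rank_llm RankVicuna class."""

    model_name: str
    rank_fn: RankFn

    @classmethod
    def from_pretrained_model(
        cls, model_name: str, loader: Callable[[str], RankFn]
    ) -> "SketchReranker":
        # `loader` is an assumption of this sketch: it maps a model name
        # to a ranking function, so the factory can be tried without a GPU.
        return cls(model_name=model_name, rank_fn=loader(model_name))

    def rerank(self, query: str, documents: List[str]) -> List[str]:
        # Ask the ranking function for an ordering, then reorder the inputs.
        order = self.rank_fn(query, documents)
        return [documents[i] for i in order]


def toy_loader(model_name: str) -> RankFn:
    """Hypothetical loader: ranks by word overlap, ignores model_name."""

    def rank(query: str, documents: List[str]) -> List[int]:
        query_words = set(query.lower().split())
        scores = [len(query_words & set(d.lower().split())) for d in documents]
        # Higher overlap first; sorted() is stable, so ties keep input order.
        return sorted(range(len(documents)), key=lambda i: -scores[i])

    return rank
```

With this shape, the calling code stays the same two lines as the snippet above (construct via the classmethod, then call rerank); only the loader would need to be swapped for one that loads real weights.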

cc @lintool @sahel-sh

yilinjz commented 8 months ago

Updated 11/2/2023:

The rerank function has been implemented and added to RankVicuna in a recent PR (https://github.com/castorini/rank_llm/pull/24). This PR includes the following:

  1. Adds support for inline documents/hits and sample demos by creating a wrapper Retriever around the Pyserini retriever.
  2. Adds a Reranker class that moves the reranking logic out of the run_rank_llm script and into a module.
  3. Adds a retrieve-and-rerank module that replicates the run_rank_llm script.
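
To make the inline-documents idea concrete, here is a rough sketch of wrapping raw strings as hit records and then reordering them. This is only an illustration of the pattern, not the code from the PR: inline_docs_to_hits, retrieve_and_rerank, the rank_fn argument, and the docid/content/rank/score fields are all assumptions made up for this example.

```python
from typing import Callable, Dict, List

# Hypothetical hit record; the field names are assumptions of this sketch.
Hit = Dict[str, object]


def inline_docs_to_hits(documents: List[str]) -> List[Hit]:
    """Wrap raw strings as hit records with placeholder scores."""
    return [
        {"docid": str(i), "content": doc, "rank": i + 1, "score": 0.0}
        for i, doc in enumerate(documents)
    ]


def retrieve_and_rerank(
    query: str,
    documents: List[str],
    rank_fn: Callable[[str, List[str]], List[int]],
) -> List[Hit]:
    """Build hits from inline documents, then reorder them with rank_fn.

    rank_fn is a stand-in for the model: it returns indices best-first.
    """
    hits = inline_docs_to_hits(documents)
    order = rank_fn(query, [str(h["content"]) for h in hits])
    reranked = []
    for new_rank, idx in enumerate(order, start=1):
        hit = dict(hits[idx])  # copy so the original hit list is untouched
        hit["rank"] = new_rank
        reranked.append(hit)
    return reranked
```

Keeping the wrapping step separate from the reordering step means the same reordering code can take hits from an index-backed retriever or from a plain Python list.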

Demos of the new rerank function can be found at https://github.com/castorini/rank_llm/tree/main/demo. To reproduce, please follow one of the two approaches below:

  1. If you have access to a GPU on your local machine, take the following steps:

  2. If you do not have access to a GPU on your local machine, the reproduction can be done on Colab. Follow this notebook for the steps: https://colab.research.google.com/drive/16SJ2r5F19mNvupRNGdGke6u2zgSVbgYI#scrollTo=vhYgQ_xiiwHg