castorini / rank_llm

Repository for prompt-decoding using LLMs (GPT3.5, GPT4, Vicuna, and Zephyr)
http://rankllm.ai
Apache License 2.0
273 stars 35 forks

p2- expose pyserini retriever's top k candidate to retriever with k=100 being the default value #81

Closed sahel-sh closed 5 months ago

sahel-sh commented 5 months ago

Currently, the retriever.from_.. methods for datasets and custom indexes do not take k as a parameter; they should accept it and pass it down to pyserini. The candidate file names used for storing and reusing the retrieved_results should also include this parameter, so that retrieving the top 20 does not falsely collide with retrieving the top 100 by sharing the same file name.
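
A rough sketch of the idea, not the actual rank_llm code: the helper names `candidates_path` and `retrieve_top_k`, the file name pattern, and the `search_fn` callable are all hypothetical and only illustrate threading k through to the searcher and into the cache file name, with k=100 as the default.

```python
from pathlib import Path
from typing import Callable, List

DEFAULT_TOP_K = 100  # default value proposed in this issue


def candidates_path(cache_dir: str, dataset: str, method: str,
                    k: int = DEFAULT_TOP_K) -> Path:
    """Hypothetical helper: build a cache file path that encodes k,
    so top-20 and top-100 runs never share a file name."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    return Path(cache_dir) / f"retrieve_results_{dataset}_{method}_top{k}.json"


def retrieve_top_k(search_fn: Callable[[str, int], List[str]],
                   query: str, k: int = DEFAULT_TOP_K) -> List[str]:
    """Hypothetical wrapper: pass k down to the underlying searcher.
    search_fn stands in for whatever call the retriever makes to pyserini."""
    return search_fn(query, k)[:k]


# Stand-in searcher for illustration only; it returns fake document ids.
def fake_search(query: str, k: int) -> List[str]:
    return [f"doc{i}" for i in range(k)]


top20 = retrieve_top_k(fake_search, "example query", k=20)
path20 = candidates_path("cache", "dl19", "bm25", k=20)
path100 = candidates_path("cache", "dl19", "bm25")  # uses the default k=100
```

With k encoded in the file name, a cached top-20 result cannot be mistaken for a top-100 result when the stored candidates are reused.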

sahel-sh commented 5 months ago

@jasper-xian, given your previous CL about pyserini_retriever, you are a good candidate for this change, which would be much simpler. Do you have the bandwidth to work on this?

jasper-xian commented 5 months ago

yup I can take this

sahel-sh commented 5 months ago

it is yours, thank you!

sahel-sh commented 5 months ago

Thank you @jasper-xian for working on this!