vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0
31.15k stars, 4.73k forks

[Feature]: Support rerank models #6928

Open etwk opened 4 months ago

etwk commented 4 months ago

🚀 The feature, motivation and pitch

Rerank models are essential to RAG workflows. Quite a few models are available, such as jina-reranker-v2. Some inference frameworks already support rerank models; for example, see https://inference.readthedocs.io/en/latest/models/builtin/rerank/index.html

Are there plans to add support for this? What would the main steps be for someone who wants to implement it?
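For readers unfamiliar with the term, here is a minimal sketch of what a rerank step does in a RAG workflow: score each (query, document) pair, then return the documents sorted by relevance. Everything in it is an assumption made for illustration. `toy_overlap_score` is a hypothetical word-overlap scorer standing in for a real rerank model such as jina-reranker-v2, and `rerank` and `RankedDocument` are made-up names, not part of vLLM or any existing API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class RankedDocument:
    index: int    # position of the document in the input list
    score: float  # relevance score assigned by the scorer
    text: str


def toy_overlap_score(query: str, document: str) -> float:
    """Hypothetical stand-in for a rerank model.

    Returns the fraction of distinct query words that also appear in the
    document. A real reranker would run a model over the (query, document) pair.
    """
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    doc_words = set(document.lower().split())
    return len(query_words & doc_words) / len(query_words)


def rerank(
    query: str,
    documents: List[str],
    score_fn: Callable[[str, str], float] = toy_overlap_score,
    top_n: Optional[int] = None,
) -> List[RankedDocument]:
    """Score every document against the query and sort by descending score."""
    scored = [
        RankedDocument(index=i, score=score_fn(query, doc), text=doc)
        for i, doc in enumerate(documents)
    ]
    scored.sort(key=lambda d: d.score, reverse=True)
    return scored if top_n is None else scored[:top_n]


docs = [
    "vllm is a serving engine for llms",
    "rerank models score query document pairs",
    "bananas are yellow",
]
top = rerank("how do rerank models score documents", docs, top_n=2)
for item in top:
    print(item.index, round(item.score, 2), item.text)
```

In a RAG pipeline, `documents` would typically be the candidates returned by a first-stage retriever, and `score_fn` would call the rerank model. Supporting such models in a serving engine amounts to exposing that scoring call efficiently.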

chengxiangwang commented 2 months ago

+1

cyberluke commented 1 month ago

+1

gavrissh commented 2 weeks ago

+1

nagar-ajay commented 2 weeks ago

+1