vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

Support for Contrastive Search #1219

Open abdulvirta opened 1 year ago

abdulvirta commented 1 year ago

Does vLLM support contrastive search? If not, it would be great to add it as soon as possible. Research shows that it improves model quality significantly, and it is already supported in the Hugging Face transformers library. It would be great to have parity for this in vLLM.

viktor-ferenczi commented 1 year ago

Related to #535

abdulvirta commented 1 year ago

So would the implementation of that ticket let end users implement contrastive search on their side? Or do you mean that contrastive search support depends on that work landing first?

Peter-Devine commented 1 year ago

I may be mistaken, but implementing contrastive search would require more than just enabling logit processing.

Contrastive search, as I understand it, requires access to the last hidden state of every previous token. In the selection rule from the paper linked above (image not shown here), h is the last hidden state.

Looking at Hugging Face's implementation of contrastive search, particularly the _ranking_fast function, we can see that the hidden states are needed to compute the output. Is access to the last hidden state of each previous token even possible with paged attention? If not, I would guess that having access to only some of the hidden states would still give generation similar to contrastive search (this is just a hypothesis, though).
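To make the dependency on hidden states concrete, here is a minimal pure-Python sketch of the re-ranking step as I understand it from the paper: each top-k candidate is scored as (1 - alpha) times its model probability, minus alpha times its maximum cosine similarity to the last hidden state of any previous token. This is not Hugging Face's _ranking_fast and not vLLM code; the names `cosine` and `contrastive_pick`, the argument layout, and the default alpha of 0.6 are all hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def contrastive_pick(candidate_probs, candidate_hiddens, context_hiddens, alpha=0.6):
    """Return the index of the candidate with the highest contrastive score.

    candidate_probs:   model probability of each top-k candidate token
    candidate_hiddens: last hidden state the model produces for each candidate
    context_hiddens:   last hidden state of every previously generated token
    alpha:             weight of the degeneration penalty (illustrative default)
    """
    best_index, best_score = -1, -math.inf
    for i, (prob, h_v) in enumerate(zip(candidate_probs, candidate_hiddens)):
        # Penalty: how close this candidate is to its most similar previous token.
        penalty = max(cosine(h_v, h_j) for h_j in context_hiddens)
        score = (1.0 - alpha) * prob - alpha * penalty
        if score > best_score:
            best_index, best_score = i, score
    return best_index


# Toy example: candidate 0 is more probable but repeats the context direction,
# so the penalty pushes the choice to candidate 1.
context = [[1.0, 0.0], [0.9, 0.1]]
choice = contrastive_pick([0.6, 0.4], [[1.0, 0.0], [0.0, 1.0]], context)
```

The point of the sketch is that `context_hiddens` has to hold one vector per previous token, so a logits processor that only sees token ids and logits would not have enough information to compute the penalty term.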

Anyway, thanks for the great package :)

YecanLee commented 2 months ago

Any update on adding this feature?👀