Open abdulvirta opened 1 year ago
Related to #535
So would implementing that ticket let end users implement contrastive search on their side? Or do you mean that contrastive search support depends on that work landing first?
I may be mistaken, but implementing contrastive search would require more than just enabling logit processing.
Contrastive search, from my understanding, requires access to the last hidden state of each previous token. In the selection rule from the paper linked above, h is the last hidden state.
Looking at Hugging Face's implementation of contrastive search, particularly the _ranking_fast function, we can see that the hidden states are needed to compute the ranking. Is access to the last hidden state of each previous token even possible with paged attention? If not, I would guess that having access to only some of the hidden states would still produce generations similar to contrastive search (this is just a hypothesis, though).
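To make the dependency on hidden states concrete, here is a minimal pure-Python sketch of the selection step as I understand it from the paper. It is not Hugging Face's _ranking_fast and not a vLLM API; the names contrastive_pick, candidates, and context_hidden are made up for illustration, and the hidden states are plain lists of floats.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def contrastive_pick(candidates, context_hidden, alpha=0.6):
    """Pick one token among the top-k candidates (hypothetical helper).

    candidates: list of (token, prob, hidden), where hidden is the last hidden
        state the model would produce if that token were appended.
    context_hidden: last hidden states of all previous tokens.
    Score per candidate v:
        (1 - alpha) * p(v | x_<t) - alpha * max_j cos(h_v, h_{x_j})
    """
    best_token, best_score = None, -math.inf
    for token, prob, hidden in candidates:
        # Degeneration penalty: similarity to the most similar previous token.
        penalty = max((cosine(hidden, h) for h in context_hidden), default=0.0)
        score = (1 - alpha) * prob - alpha * penalty
        if score > best_score:
            best_token, best_score = token, score
    return best_token

# Toy example: "the" is more likely but its hidden state duplicates the context.
context_hidden = [[1.0, 0.0]]
candidates = [("the", 0.6, [1.0, 0.0]), ("cat", 0.4, [0.0, 1.0])]
choice = contrastive_pick(candidates, context_hidden, alpha=0.6)
greedy = contrastive_pick(candidates, context_hidden, alpha=0.0)
```

Note that the penalty term needs a hidden state for every previous token and for every candidate, which is why a logits processor that only sees logits would not be enough on its own.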
Anyway, thanks for the great package :)
Any update on adding this feature?👀
Does vLLM support contrastive search? If not, it would be great to add that support as soon as possible. Research shows that it improves generation quality significantly, and it is already supported in the Hugging Face transformers library. It would be great to get parity for this in vLLM.