Possible error in top-p filtering

dacorvo commented 9 months ago

Looking at the code of the top_k_top_p_filtering method in sampling.py, I am wondering if the algorithm for applying the top-p filtering is correct.

Unlike the transformers implementation, the algorithm performs a cumulative sum on logits probabilities sorted in descending order, which seems to lead to a different selection.

Example:

scores = [0.1, 0.2, 0.7]
top_p = 0.8
top_p_scores = [0.2, 0.7]

transformers algorithm

sorted_scores = [0.1, 0.2, 0.7]
cum_sum = [0.1, 0.3, 1.0]
top_p_scores = top_p_scores[cum_sum > (1 - top_p)] = [0.2, 0.7] <- correct

transformers-neuronx algorithm

sorted_scores = [0.7, 0.2, 0.1]
cum_sum = [0.7, 0.9, 1.0]
top_p_scores = top_p_scores[cum_sum <= top_p] = [0.7] <-incorrect

I checked the result by crafting a sample:

>>> import torch
>>> from transformers_neuronx.sampling import top_k_top_p_filtering
>>> probs = torch.tensor([[0.1, 0.2, 0.7]])
>>> logits = torch.log(probs)
>>> top_p_logits, top_p_indices = top_k_top_p_filtering(logits, top_k=None, top_p=0.8)
>>> print(top_p_indices)
tensor([[2]])

aws-rhsoln commented 9 months ago

Thank you for reporting. We are looking into it, will get back on this soon

aws-rhsoln commented 9 months ago

We identified the issue, and we would have a fix in the upcoming release.

gsnaws commented 6 months ago

@dacorvo I believe this is fixed with the latest 2.16. Please try and let us know if you are still facing any issues.

dacorvo commented 5 months ago

Actually, I don't use it: it is just that going through the code I noticed a difference with what is done in transformers.

aws-rhsoln commented 4 months ago

Closing the issue as its suppose to be fixed with 2.16 release. Please re-open if you find another issue.

aws-neuron / transformers-neuronx

Possible error in top-p filtering #46