Open belerico opened 7 months ago
Thanks for suggesting this and offering to contribute.
In short, instead of keeping a fixed number of candidate tokens as in top-k, it keeps the smallest set of most-probable tokens whose cumulative probability reaches a threshold p. I think this is a popular, standard technique and could potentially be added as an option for litgpt chat, analogous and in addition to the top_k setting.
It would be a nice contribution.
What do you think @awaelchli @carmocca ?
I agree
Nucleus sampling (top-p sampling in Hugging Face) is a dynamic sampling strategy that "truncat[es] the unreliable tail of the probability distribution, sampling from the dynamic nucleus of tokens containing the vast majority of the probability mass." It can be easily implemented in the sample method like this:
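A minimal, framework-free sketch of the idea, for illustration only. It is not the snippet originally posted and not litgpt's actual `sample` code; the names `top_p_filter` and `sample_top_p` are hypothetical, and a real implementation would presumably work on tensors (sort, cumulative sum, mask) rather than Python lists:

```python
import math
import random


def top_p_filter(logits, top_p):
    """Return {token_id: prob} for the smallest set of most-probable tokens
    whose cumulative probability reaches top_p, renormalized to sum to 1."""
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    # Numerically stable softmax over the raw logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Visit tokens from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break  # nucleus is complete; the remaining tail is discarded
    return {i: probs[i] / cumulative for i in kept}


def sample_top_p(logits, top_p, rng=random):
    """Draw one token id from the nucleus (hypothetical helper)."""
    nucleus = top_p_filter(logits, top_p)
    ids = list(nucleus)
    return rng.choices(ids, weights=[nucleus[i] for i in ids], k=1)[0]
```

Unlike top_k, the number of surviving tokens here depends on how peaked the distribution is: a confident prediction may keep a single token, while a flat one keeps many. The two filters can also be combined by applying top_k first and top-p on what remains.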
I can open a PR with this addition if it is considered useful.