Closed by 3dfactor 1 month ago
Shouldn't be too hard. I have some comments for the creator of this sampler though, so I'm following up in your link.
Will be added in 1.74
https://github.com/LostRuins/koboldcpp/commit/5bf527a6aec241249793be17e4e3b7a0dbed59b2
@LostRuins
Please see my comments in that commit.
I would recommend waiting until the parameter discussion in the original PR has been resolved before releasing this in Kobold, to avoid potentially diverging implementations.
Yep, sure @p-e-w, it's not live yet. I saw your comments on https://github.com/LostRuins/koboldcpp/commit/5bf527a6aec241249793be17e4e3b7a0dbed59b2#r145630779 and will address them.
Particularly the part about keeping the tail: this line [image] gave me the impression that only one candidate should remain at the end. But now I think you're saying I should not touch any tokens below the xtc_threshold (i.e. leave them as-is), correct?
So the final result is only a warping of (n-1) out of the n tokens above the threshold (if multiple exist), or nothing at all (if n<=1). No truncation occurs in either case.
@LostRuins
> But now I think you're saying I should not touch any tokens below the xtc_threshold (i.e. leave them as-is), correct?
> So the final result is only a warping of (n-1) out of the n tokens above the threshold (if multiple exist), or nothing at all (if n<=1). No truncation occurs in either case.
Yes, that's correct. The text from the image you cut out is intended to supplement the two bar charts, where you can see that the only tokens that are removed (faded out) are ones above the threshold. A less ambiguous version of the last line would be:
> ...remove all tokens above the threshold, except the least probable one, from sampling
Closing as added in latest version.
The Exclude Top Choices (XTC) sampling algorithm is a novel sampler that turns truncation on its head: instead of pruning the least likely tokens, it removes the most likely tokens from consideration under certain circumstances.
More precisely, with a given probability, it removes all tokens meeting a given threshold except the least likely one. This ensures that at least one "viable" choice remains, retaining coherence. Truncation samplers can be applied as usual, preventing garbage from being sampled. The result is coherent output (because truncation removes bad tokens) with unprecedented creativity (because XTC removes "boring" tokens).
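To make the behaviour concrete, here is a minimal Python sketch of the rule as described above. It is not the koboldcpp or oobabooga code: the function name `xtc_filter` and the parameter names `threshold` and `probability` are made up for illustration, and renormalizing a probability dict is an assumption (a real sampler would more likely mask logits).

```python
import random


def xtc_filter(probs, threshold=0.1, probability=0.5, rng=random):
    """Return a renormalized copy of `probs` with the XTC rule applied.

    probs: dict mapping token -> probability (assumed to sum to 1).
    threshold, probability: illustrative names for the two XTC parameters.
    """
    # The sampler only acts with the given probability; otherwise it is a no-op.
    if rng.random() >= probability:
        return dict(probs)

    # Tokens meeting the threshold are the "top choices".
    above = [tok for tok, p in probs.items() if p >= threshold]
    if len(above) < 2:
        # Zero or one token meets the threshold: nothing to exclude.
        return dict(probs)

    # Keep the least probable of the top choices and drop the others.
    keep = min(above, key=lambda tok: probs[tok])
    removed = set(above) - {keep}

    # Tokens below the threshold are left as-is (no truncation of the tail).
    kept = {tok: p for tok, p in probs.items() if tok not in removed}
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}


# Hypothetical distribution: three tokens meet the 0.1 threshold.
example = {"the": 0.5, "a": 0.3, "one": 0.15, "zebra": 0.05}
print(xtc_filter(example, threshold=0.1, probability=1.0))
```

With `probability=1.0` the rule always fires here: "the" and "a" are dropped, "one" survives as the least likely token above the threshold, and "zebra" stays because it sits below the threshold. That tail would be handled by a separate truncation sampler if one is enabled.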
The oobabooga implementation can be found here, along with an eloquent description: https://github.com/oobabooga/text-generation-webui/pull/6335