A new sampler named 'DRY' appears to be a much better way of handling repetition in model output than the crude repetition penalty exposed by koboldcpp.
Specify options dry_multiplier = 0.8, dry_allowed_length = 2, dry_base = 1.75
Look at the longest sequence at the tail of the context that also occurs earlier in it, stopping the match at any of the sequence-breaker tokens (quotes, asterisks, newlines).
Say it's of length seq_len.
Apply a penalty of dry_multiplier * dry_base^(seq_len - dry_allowed_length) to any token that would extend that repetition.
(Only consider tokens in the repetition penalty range).
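As a quick sanity check of the penalty formula with the settings above, here is a hypothetical helper (the name dry_penalty is mine, not from the PR):

```python
def dry_penalty(seq_len, multiplier=0.8, allowed_length=2, base=1.75):
    """Penalty for a token that would extend a repeated sequence of
    seq_len tokens. Assumed here to be zero below the allowed length."""
    if seq_len < allowed_length:
        return 0.0
    return multiplier * base ** (seq_len - allowed_length)

# seq_len = 2 -> 0.8, 3 -> 1.4, 4 -> 2.45: the penalty grows by a
# factor of dry_base (1.75) for every extra repeated token.
```

So a two-token repeat is barely discouraged, while long verbatim runs quickly accumulate a logit penalty large enough to break them.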
Because this penalty is exponential, longer verbatim repeated sequences are penalized heavily. The sequence-breaker tokens keep the sampler from interfering with formatting, while the penalty-free allowed length keeps it from interfering with the grammar of responses.
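Putting the steps together, a minimal sketch of the whole idea (illustrative names and a naive O(n^2) scan, not the actual koboldcpp or text-generation-webui implementation):

```python
SEQUENCE_BREAKERS = {'"', "*", "\n"}

def match_length(tokens, candidate, breakers=SEQUENCE_BREAKERS):
    """Longest n such that the last n tokens of the context also appeared
    earlier, immediately before an occurrence of `candidate`. Matching
    stops at any sequence-breaker token."""
    best = 0
    for i, tok in enumerate(tokens):
        if tok != candidate:
            continue
        n = 0
        # Walk backwards, comparing the context suffix with the tokens
        # that preceded this earlier occurrence of `candidate`.
        while (n < i
               and tokens[i - 1 - n] == tokens[-1 - n]
               and tokens[-1 - n] not in breakers):
            n += 1
        best = max(best, n)
    return best

def apply_dry(logits, tokens, multiplier=0.8, allowed_length=2, base=1.75):
    """Subtract the DRY penalty from the logit of every candidate token
    that would extend a repeated sequence of at least allowed_length."""
    out = dict(logits)
    for tok in out:
        n = match_length(tokens, tok)
        if n >= allowed_length:
            out[tok] -= multiplier * base ** (n - allowed_length)
    return out
```

For example, with context ["a", "b", "c", "a", "b"], the candidate "c" would extend the repeat "a b" (length 2), so its logit drops by 0.8, while unrelated tokens are untouched.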
Original PR: https://github.com/oobabooga/text-generation-webui/pull/5677/commits/b79688423b058f55f2f14faac1ff333eecad4652