LostRuins / koboldcpp

A simple one-file way to run various GGML and GGUF models with KoboldAI's UI
https://github.com/lostruins/koboldcpp
GNU Affero General Public License v3.0

Repetition Range and Slope Doesn't Work as Tooltips and Docs Suggest #884

Open Reithan opened 1 month ago

Reithan commented 1 month ago

The documentation seems to conflict over whether setting Rep Range to 0 or to -1 disables the range. Testing with the lite.koboldai UI, -1 is automatically corrected to 0, and looking at the code, 0 disables rep penalty completely.

Looking at https://github.com/LostRuins/koboldcpp/blob/v1.66/llama.cpp#L14104, it appears that slope isn't used at all by Llama models. Looking at https://github.com/LostRuins/koboldcpp/blob/v1.66/gpttype_adapter.cpp#L426, it appears that slope is applied once at the start of the rep range and never updated, so it acts as a constant scale rather than as an actual slope.

EDIT: Updated the issue title to match the new info.

LostRuins commented 1 month ago

The slope here works differently from the gradual slope that the original KoboldAI uses.

The slope in koboldcpp works like this: The tokens in the rep pen range are divided into two groups, near and far. Then rep_pen is applied to tokens from the 'near' group, whereas rep_pen*slope is applied to tokens from the 'far' group. So it mimics the effects of a slope but is more coarse in nature.
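As a rough illustration of the near/far description above, here is a minimal Python sketch. It is not the koboldcpp source: the function name `coarse_slope_penalty` is hypothetical, the midpoint split between 'near' and 'far' is an assumption (the comment does not say where the boundary lies), and so is the convention of dividing positive logits and multiplying negative ones by the penalty. The far-group factor is taken literally as rep_pen*slope, as stated in the comment.

```python
def coarse_slope_penalty(logits, recent_tokens, rep_pen, slope, rep_pen_range):
    """Penalize recently seen tokens with a two-level ("coarse") slope.

    logits        -- list of floats indexed by token id
    recent_tokens -- token ids seen so far, oldest first
    Returns a new list; the input logits are not modified.
    """
    penalized = list(logits)
    if rep_pen_range <= 0:
        # An empty window means there is nothing to penalize.
        return penalized

    # Only the last rep_pen_range tokens are considered.
    window = recent_tokens[-rep_pen_range:]

    # ASSUMPTION: the window is split at its midpoint into far and near.
    split = len(window) // 2
    far, near = window[:split], window[split:]

    # One penalty factor per token id. ASSUMPTION: a token that appears
    # in both groups gets the 'near' factor.
    factor = {}
    for tok in far:
        factor[tok] = rep_pen * slope
    for tok in near:
        factor[tok] = rep_pen

    # ASSUMPTION: positive logits are divided and negative logits are
    # multiplied, so a factor above 1 always makes the token less likely.
    for tok, pen in factor.items():
        if penalized[tok] > 0:
            penalized[tok] /= pen
        else:
            penalized[tok] *= pen
    return penalized
```

For example, with `rep_pen=2.0` and `slope=0.75`, tokens in the older half of the window are penalized by a factor of 1.5 and tokens in the newer half by 2.0, which gives a two-step approximation of a slope rather than a gradual one.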

Reithan commented 1 month ago

Nice. OK, so the behavior is working as intended. However, the tooltips and the documentation on the git wikis for koboldcpp and lite.koboldai.net probably still need updating to match the implemented behavior, ESPECIALLY the ones that say that -1/0 disables the Rep Range (it disables Rep Penalty entirely instead).

LostRuins commented 1 month ago

Disabling rep pen range and disabling rep pen have the same effect, don't they? You can't have a rep pen with a 0 range.

Reithan commented 1 month ago

Frustratingly, I can't find it now, but there was a tooltip or wiki page somewhere on here that said to set range to 0 to disable it, or to -1 to use the whole context. In practice, -1 autocorrects to 0 and disables it.

Also, "Disable Range" doesn't explicitly say that it disables the penalty, so it wasn't clear to me whether that meant it would use some minimum range, count only the generated text, or something else.
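To make the range behavior reported in this thread concrete, the following Python sketch models how a range value could map to the set of tokens that are eligible for the penalty. The function name `penalty_window` is hypothetical, and the clamping of -1 to 0 simply mirrors what was observed in the UI above; it is not taken from the koboldcpp or Lite source.

```python
def penalty_window(context_tokens, rep_pen_range):
    """Return the tokens eligible for repetition penalty.

    ASSUMPTION: negative values are clamped to 0, mirroring the
    autocorrection reported in this thread, so -1 does NOT mean
    "whole context" in this sketch.
    """
    rep_pen_range = max(rep_pen_range, 0)
    if rep_pen_range == 0:
        # A zero-length window contains no tokens, so no token can be
        # penalized: disabling the range disables the penalty itself.
        return []
    # A range larger than the context simply covers the whole context.
    return context_tokens[-rep_pen_range:]
```

Under these assumptions, a range of 0 and a range of -1 both yield an empty window, which is why "disable the range" and "disable the penalty" end up being the same thing.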