LostRuins / koboldcpp

Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
https://github.com/lostruins/koboldcpp
GNU Affero General Public License v3.0

Missing SelfExtend options #1066

Open jojje opened 3 months ago

jojje commented 3 months ago

Describe the Issue
llama.cpp exposes the options --grp-attn-n and --grp-attn-w for the group size and neighbor window size hyperparameters from the SelfExtend paper.

Without those parameters, we cannot extend the context of models with short native context windows, such as Gemma 2, without expensive fine-tuning.
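For reference, this is roughly what those two hyperparameters control. The sketch below follows my reading of the SelfExtend paper's position remapping, not llama.cpp's actual implementation; the function name is made up for illustration:

```python
def self_extend_rel_pos(i: int, j: int, group_size: int, window: int) -> int:
    """Relative position between query token i and key token j (i >= j)
    under SelfExtend-style attention (sketch of the paper's idea only).

    group_size corresponds to --grp-attn-n, window to --grp-attn-w.
    """
    rel = i - j
    if rel <= window:
        # tokens inside the neighbor window keep their exact relative positions
        return rel
    # distant tokens fall back to coarser, grouped positions; the shift
    # keeps the mapping continuous at the window boundary
    return (i // group_size) - (j // group_size) + window - window // group_size
```

Because distant positions are compressed by the group factor, the remapped relative positions stay within the range the model saw during training, which is what lets the context be extended without fine-tuning.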

Please consider exposing those options via the koboldcpp front-end as well.

Additional Information: N/A

LostRuins commented 3 months ago

You can already extend the context with --contextsize [desired max context length], which handles context-size scaling automatically via gradient RoPE scaling.
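Roughly speaking, instead of remapping positions into groups, the context is stretched by rescaling the rotary position embeddings. A generic sketch of plain linear RoPE scaling is below (illustration only, not the exact formula koboldcpp's automatic scaling uses; all names and defaults are made up for the example):

```python
def rope_angles(pos: int, head_dim: int, base: float = 10000.0,
                trained_ctx: int = 8192, target_ctx: int = 16384) -> list[float]:
    """Rotary embedding angles for one token position with simple linear
    position scaling. koboldcpp derives its scaling automatically from
    --contextsize and the model, so no extra flags are needed.
    """
    scale = trained_ctx / target_ctx           # e.g. 0.5 when doubling the context
    scaled_pos = pos * scale                   # compress positions into the trained range
    return [scaled_pos / (base ** (2 * i / head_dim)) for i in range(head_dim // 2)]
```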