LostRuins / koboldcpp

Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
https://github.com/lostruins/koboldcpp
GNU Affero General Public License v3.0

Missing SelfExtend options #1066

Open jojje opened 1 month ago

jojje commented 1 month ago

Describe the Issue

llama.cpp exposes the options --grp-attn-n and --grp-attn-w for the group size and neighbor window size hyperparameters from the SelfExtend paper.

Without those parameters, there is no way to extend the context of models with short native context windows, such as Gemma2, other than expensive fine-tuning.

Please consider exposing those options via the koboldcpp front-end as well.

Additional Information: N/A
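
For reference, the flags look roughly like this on the llama.cpp side. The binary name, model file, and values below are only illustrative, and if I recall correctly --grp-attn-w should be a multiple of --grp-attn-n:

```sh
# Illustrative llama.cpp invocation (binary name, model file, and values are
# examples, not taken from this issue): run a short-context model with
# SelfExtend-style grouped attention enabled.
#   --grp-attn-n : group size (how many positions are merged into one group)
#   --grp-attn-w : neighbor window size (range kept at normal attention)
./llama-cli -m gemma-2-9b-it-Q4_K_M.gguf \
  -c 16384 \
  --grp-attn-n 4 \
  --grp-attn-w 2048
```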

LostRuins commented 1 month ago

You can extend the context size with --contextsize [desired max context length], which handles context scaling automatically via gradient RoPE scaling.
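
For example, a minimal launch along those lines (the model file and context value are placeholders, not from this thread):

```sh
# Example koboldcpp launch requesting a larger context; the automatic RoPE
# scaling is applied based on the requested --contextsize value.
# Model filename and the 16384 value are placeholders.
python koboldcpp.py --model gemma-2-9b-it-Q4_K_M.gguf --contextsize 16384
```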