Closed Nexesenex closed 3 months ago
The PR works as is, extending usable context beyond the previous 5k cap for Gemma v2 softcap. Slaren just wants to simplify the code.
I think I'll wait to merge this until it's merged in llama.cpp.
And merged it is, with a further drop in perplexity! :D
It expands the usable context from 5k to 8k.
https://github.com/ggerganov/llama.cpp/pull/8227
Additional commits:
- fix data_swa uninitialized
- better naming
- add co-author