LostRuins / koboldcpp

Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
https://github.com/lostruins/koboldcpp
GNU Affero General Public License v3.0

[Vulkan NoAVX2] Generating nonsense after Context Shifting #712

Closed. Denplay195 closed this 5 months ago

Denplay195 commented 5 months ago

This happens only with Vulkan NoAVX2: after context shifting (I tried both ways, as the attached screenshots show), all further generations are complete nonsense (or a single repeating word if Mirostat 2 is on), no matter what I change in the prompt, settings, or model.

image

image

P.S. Thanks for adding this backend! It turns out to be much more memory-hungry and runs hot on my machine, but it is more than twice as fast in much of what I've tested 🔥

LostRuins commented 5 months ago

What model is this?

Denplay195 commented 5 months ago

> What model is this?

I guess it's Noromaid-13B-0.4-DPO.q4_k_m, but I get the same result with several models.

Others just typed out assorted symbols and words (Russian, Chinese, French) without spaces, with nothing in the context prompting them to do so.

The list of other models that I've tested with Vulkan NoAVX2: toxichermes-2.5-mistral-7b.Q5_K_M, Misted-7B-Q5_0, Fimbulvetr-11B-v2-Test-14.q5_K_M, marx-3b-v3.Q8_0, kunoichi-dpo-v2-7b.Q5_K_M, silicon-maid-7b.Q5_K_S

Denplay195 commented 5 months ago

After the latest updates it somehow got better with the same models and settings LuL. It may still happen often with 13B models, but no longer with 7B ones.