LostRuins / koboldcpp

Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
https://github.com/lostruins/koboldcpp
GNU Affero General Public License v3.0

[BUG] Using Longchat Model generating gibberish #349

Closed racheandre closed 1 year ago

racheandre commented 1 year ago

When using the Longchat model with the latest version 94e0a06daf9605b3fe23252bf3e4aa123f6027f4, it generates gibberish. I found that version e1a7042943a0016dc979554372641112150dc346 from 2nd July works, though it may not be the last version before the breaking change, since I don't have time to bisect.

Chat Result

 Human: hi
 AI: :::::::::::::::::::::::::::

Environment

LostRuins commented 1 year ago

Hi, the model is fine; koboldcpp recently changed the way RoPE works.

For this model, you will now need to use linear RoPE scaling via --ropeconfig. Please run with the command:

python koboldcpp.py --stream --model longchat-13b-16k.ggmlv3.q3_K_S.bin --useclblast 0 0 --contextsize 8192 --gpulayers 43 --threads 4 --ropeconfig 0.25 10000

Please let me know if this works for you.
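For readers unfamiliar with the setting: `--ropeconfig 0.25 10000` sets a linear RoPE scale of 0.25 and a frequency base of 10000. The idea behind linear scaling can be sketched as below; this is an illustrative NumPy snippet, not koboldcpp's actual implementation, and the function name is hypothetical.

```python
import numpy as np

def rope_angles(pos, dim, base=10000.0, scale=1.0):
    """Rotation angles for one token position under linear RoPE scaling.

    Linear scaling (the --ropeconfig 0.25 10000 case) multiplies the
    position index by the scale factor before computing the rotary
    angles, stretching a model trained on a shorter context window
    out to a longer one.
    """
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)  # per-pair frequencies
    return (pos * scale) * inv_freq                   # scaled position

# With scale 0.25, position 8192 gets the same angles as position 2048
# at scale 1.0, so positions beyond the trained window stay in range.
a = rope_angles(8192, dim=128, scale=0.25)
b = rope_angles(2048, dim=128, scale=1.0)
assert np.allclose(a, b)
```

This is why a scale of 0.25 pairs naturally with `--contextsize 8192` for a model extended from a shorter training context.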

racheandre commented 1 year ago

It works, thank you! It seems I missed another big change after not touching LLMs for a month. Can the dynamic RoPE settings also be applied to the new LLAMA2 model with its 4k context size?