Closed DutchEllie closed 6 months ago
Maybe because you use the Alpaca instruction template?
It happens using ChatML as well.
Have you ever successfully used your Mixtral K quants? [People have been having a lot of problems with them, including me.](https://cdn-uploads.huggingface.co/production/uploads/63ab1241ad514ca8d1430003/TvjEP14ps7ZUgJ-0-mhIX.png) I would suggest trying a Q5_0 or something similar, as those quants seem to be working fine. I would also suggest trying the 4x7b model, as I have not had nearly as many headaches with it.
I have only used K-quants so far. Even though I ran into these issues (which decreasing the context size fixed), I have not tried the non-K versions. I will do that, I guess. I never checked the quality of the generated text, so idk if it's any good.
Will check when I have the time
Same here
Edit: fixed for me now
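On the context-size point raised in this thread: as far as I understand, llama.cpp sizes the KV cache for the full `n_ctx` when the model is loaded, regardless of how many tokens are actually sent. Here is a rough sketch of how that cache grows with `n_ctx`. The defaults (32 layers, 8 KV heads, head dimension 128, 2-byte f16 cache entries) are my assumptions based on the published Mistral 7B configuration and are not taken from this issue. This only estimates memory; it does not by itself explain the gibberish output.

```python
def kv_cache_bytes(n_ctx, n_layers=32, n_kv_heads=8, head_dim=128, bytes_per_elem=2):
    """Estimate KV cache size in bytes.

    Each layer stores one K and one V tensor, each holding
    n_ctx * n_kv_heads * head_dim elements.
    The default values are assumptions (Mistral 7B style config, f16 cache).
    """
    return 2 * n_layers * n_ctx * n_kv_heads * head_dim * bytes_per_elem


if __name__ == "__main__":
    for n_ctx in (4096, 16384, 32768):
        gib = kv_cache_bytes(n_ctx) / 2**30
        print(f"n_ctx={n_ctx:>6}: ~{gib:.1f} GiB")
```

Under these assumptions the cache comes to about 4 GiB at `n_ctx` = 32768, on top of whatever model layers are offloaded to the GPU, so comparing that against free VRAM might be a useful sanity check.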
This issue has been closed due to inactivity for 2 months. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.
Describe the bug
I am on the `dev` branch right now! Very important to note.

I loaded `mistral-7b-instruct-v0.1.Q5_K_M.gguf` and `mixtral-8x7b-instruct-v0.1.Q5_K_M.gguf` using llama.cpp, offloading some layers onto my RX 7900 XTX. When the `n_ctx` is set to 32768 (or presumably higher as well), the output when using the chat is gibberish. It also happens using ChatML and Alpaca in default, and it keeps happening everywhere until I reload the model. Strangely enough, when a system prompt is not provided in default, it works, but when a system prompt is provided, it breaks.

Also, even though I set the `n_ctx` to this number, I am not actually providing that many tokens! I am working with just a couple hundred tokens at most.

Anyway, setting the `n_ctx` to anything lower than 32768 restores functionality.

This does not seem to happen when I use a different model, though. I also loaded up tinyllama, set its `n_ctx` to 32768, and sent similar prompts, and it works.

Is there an existing issue for this?
Reproduction
requirements_amd.txt
Load `mistral-7b-instruct-v0.1.Q5_K_M.gguf` and set the `n_ctx` to 32768.
Screenshot
Logs
nothing special really..
System Info