Open pl2k2000 opened 5 months ago
Yes, it uses more RAM when you initialise a model with more context. Wasn't 2048 enough for context?
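The extra memory mostly comes from the KV cache, which grows linearly with the context length. A rough sketch of the estimate (the layer count and embedding size below are assumptions for a generic 13B-class model, not values read from either GGUF file):

```python
def kv_cache_bytes(n_layers: int, n_embd: int, n_ctx: int, bytes_per_elem: int = 2) -> int:
    # Two tensors (K and V) per layer, each n_ctx x n_embd, fp16 by default.
    return 2 * n_layers * n_ctx * n_embd * bytes_per_elem

# Hypothetical 13B-class dimensions: 40 layers, 5120-dim embeddings.
for ctx in (2048, 4096):
    gib = kv_cache_bytes(40, 5120, ctx) / 2**30
    print(f"n_ctx={ctx}: ~{gib:.2f} GiB KV cache")
```

Doubling the context from 2048 to 4096 roughly doubles this allocation on top of the model weights, which is why a configuration that fits at 2048 can fail at 4096.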
My point is that I have enough RAM for the model plus 4096 context. I have no issues/crashes when using other local LLM apps with 4096 context on the same model. My model is only 8GB in size and I have 24GB RAM. Tell me why that is not enough?
Gotcha I'll take a look
I am on macOS with an M2 and 24GB RAM, and loaded mixtral_7bx2_moe.Q8_0.gguf or guanaco-13b-uncensored.Q4_K_M.gguf.
If I set the context length to 4096, it crashes when I open the chat window. A context length of 2048 works fine. The system shows more than 16GB of memory available when Reor starts.