Steps to reproduce
Load any model and use it until the conversation's token count exceeds the context window. From that point on, over 98% of prompts crash the model hard mid-response and forcibly eject it.
I tried every Conversation Overflow option, with no effect. I also tried loading smaller models and using different context window sizes, and nothing helps.
My setup
Windows 11
32 GB RAM
AMD Radeon RX 6800 XT (16 GB VRAM)
AMD Ryzen 7 5800X 8-core (3.8 GHz base speed)
Screenshots
https://cdn.discordapp.com/attachments/547061919807438852/1282088827745468477/image.png?ex=66de15c4&is=66dcc444&hm=7a70d3bb57886eb616fe23bb0acba683c437c14969aaac0807a23b6b30701e9e&