Open YW5555 opened 2 months ago
Are you sure it's not a threading issue instead? If you keep koboldcpp in the foreground with --foreground, does it still happen? I'm not sure GPU driver power-throttling behavior can be controlled from the program; it could be due to your power options, e.g. battery mode. Perhaps you need to configure it in the AMD control panel.
My AMD 6900XT's VRAM clock idles down to 0 MHz after a while, which tanks koboldcpp performance. The idle VRAM doesn't affect other workloads such as gaming, because there the VRAM frequency pops back up to 2000 MHz. But generating with koboldcpp fails to 'wake up' the VRAM, dropping my generation speed to ~7 tk/s instead of the usual ~50 tk/s. My workaround is to turn off FreeSync in the AMD driver, which forces the VRAM frequency to stay at its maximum 2000 MHz clock all the time.
This issue has occurred over the past few months across different combinations of GGUF models, Windows updates, AMD drivers, and koboldcpp versions (main with Vulkan, and the ROCm fork). All GGUF layers and the context are fully offloaded to VRAM (~11 GB).
Current hardware:
OS: Windows 11 Pro 23H2
GPU: Powercolor AMD 6900XT Reference
GPU Driver: 24.5.1
CPU: AMD 5800X