GPU: AMD Radeon RX 6800 XT
Driver: AMD proprietary driver (version 2.0.317)
Vulkan API version: 1.3.292
Model tested: Mistral-7b-v0.3
Performance: ~25 tokens/sec (down from 60 tokens/sec in v0.5.7)
GPU Layers: 37/37 layers offloaded to GPU (confirmed by cortex.log)
Abnormal behavior: High CPU usage (~90%) despite GPU offloading
A performance regression is observed on AMD GPUs (specifically the RX 6800 XT): GPU utilization is lower than expected and CPU usage is abnormally high (~90%) compared to previous versions. Although the GPU layers are offloaded correctly, generation is significantly slower (~25 tokens/sec, down from ~60 tokens/sec in v0.5.7, with Mistral-7b-v0.3).
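To put the reported numbers in perspective, here is a minimal sketch that turns the two throughput figures from this report (60 and 25 tokens/sec) into a relative drop, a slowdown factor, and the extra latency per token. The function name `regression_summary` is made up for illustration and is not part of Jan or cortex.

```python
def regression_summary(old_tps: float, new_tps: float) -> dict:
    """Summarize a throughput regression between two versions.

    old_tps / new_tps are tokens per second before and after the upgrade.
    """
    if old_tps <= 0 or new_tps <= 0:
        raise ValueError("throughput values must be positive")
    return {
        # Share of the old throughput that was lost, in percent.
        "drop_percent": (old_tps - new_tps) / old_tps * 100.0,
        # How many times slower the new version is.
        "slowdown_factor": old_tps / new_tps,
        # Additional time spent per generated token, in milliseconds.
        "extra_ms_per_token": (1.0 / new_tps - 1.0 / old_tps) * 1000.0,
    }


if __name__ == "__main__":
    # Figures taken from the report: ~60 tokens/sec in v0.5.7, ~25 in v0.5.8.
    summary = regression_summary(old_tps=60.0, new_tps=25.0)
    for key, value in summary.items():
        print(f"{key}: {value:.2f}")
```

With the reported figures this works out to roughly a 58% drop, i.e. generation is about 2.4 times slower, or about 23 ms more per token.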
Steps to Reproduce
Install Jan v0.5.8 on a system with an AMD GPU
Enable Vulkan support
Load the Mistral model with ngl (number of GPU layers) set to 100
Observe:
High CPU usage (~90%)
Lower tokens/sec compared to v0.5.7
GPU not fully utilized despite cortex.log showing layers offloaded
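To double-check the offload claim when reproducing, the layer count can be pulled out of the log text with a short script. This is only a sketch: the line format matched below is an assumption modeled on llama.cpp's loader output ("offloaded N/M layers to GPU"), and the exact wording in cortex.log may differ. The helper name `find_offload` is hypothetical.

```python
import re
from typing import Optional, Tuple

# Assumed log line format, modeled on llama.cpp loader output.
# The real cortex.log wording may differ; adjust the pattern if so.
OFFLOAD_RE = re.compile(r"offloaded\s+(\d+)/(\d+)\s+layers\s+to\s+GPU")


def find_offload(log_text: str) -> Optional[Tuple[int, int]]:
    """Return (offloaded, total) from the last matching line, or None."""
    matches = OFFLOAD_RE.findall(log_text)
    if not matches:
        return None
    offloaded, total = matches[-1]  # the last match reflects the latest model load
    return int(offloaded), int(total)


if __name__ == "__main__":
    # Hypothetical sample line, not copied from an actual cortex.log.
    sample = "llm_load_tensors: offloaded 37/37 layers to GPU"
    result = find_offload(sample)
    if result is None:
        print("no offload line found")
    else:
        offloaded, total = result
        print(f"{offloaded}/{total} layers offloaded")
```

If the script reports full offload while CPU usage stays near 90%, that supports the observation that the slowdown is not caused by layers falling back to the CPU.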
Additional Context
Issue persists after factory reset
Qwen 32GB is a large model relative to this device's specifications. Vulkan is also not stable enough to reliably reach that speed. We need to reproduce the issue to confirm that it is a bug.
Jan version
0.5.8
Describe the Bug
https://discord.com/channels/1107178041848909847/1306758623325851689
Performance regression observed with AMD GPUs (specifically RX 6800 XT) where GPU utilization is lower than expected and CPU usage is abnormally high (~90%) compared to previous versions. Despite GPU layers being offloaded correctly, the performance is significantly slower (25t/s vs previous 60t/s with Mistral-7b-v0.3).
Screenshots / Logs
cortex.log app.log
What is your OS?