[X] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
[X] I reviewed the Discussions, and have a new and useful enhancement to share.
Feature Description
Hi team,
First of all, thank you for continuing to improve this awesome project. I just discovered that with the Vulkan backend on Linux or FreeBSD, using the Mesa Vulkan driver, the Gemma-2-9B model runs 4x slower than the Llama-3-8B model. Here are the results:
Here's my setup:
OS: FreeBSD-15-Current
GPU Driver: drm-6.1-lts and Mesa RADV driver
CPU: dual-socket E5-2680v4
GPU: AMD 7900XT (20GB)
Motivation
Gemma-2 is a high-quality model for its size, and optimizing the Vulkan backend for it would be a very good addition.
Possible Implementation
No response