Closed ikawrakow closed 1 month ago
This PR improves `Q4_0` and `Q8_0` performance on `AVX2` and `Zen4`. The table shows comparisons to `llama.cpp` for LLaMA-3.1-8B on a Ryzen-7950X (Zen4) and a Ryzen-5975WX (AVX2) CPU.