turboderp/exllamav2
A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License
Difference between gemm_half_q_half_gptq_kernel and gemm_half_q_half_kernel
#202
Closed
frankxyy closed 7 months ago
frankxyy commented 7 months ago
Both gemm_half_q_half_gptq_kernel and gemm_half_q_half_kernel appear to be quantized GEMM kernels. What is the difference between them?