microsoft / BitBLAS

BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
MIT License
190 stars, 21 forks

[Kernel] Extend Fast Decoding to UINT2 + QZeros #25

Closed. LeiWang1999 closed this 2 months ago

LeiWang1999 commented 2 months ago

Fixed the issue: https://github.com/microsoft/BitBLAS/issues/24#issue-2263511731