BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
190 stars, 21 forks
[FP8] Improve tensor adapter to support fp8 conversion between torch and numpy #30
Closed
LeiWang1999 closed 1 month ago