microsoft / BitBLAS

BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
MIT License

[Dev] Issue#24: Fix a bug in repacking AutoGPTQ quantized parameters #57

Closed tzj-fxz closed 2 weeks ago

tzj-fxz commented 2 weeks ago

Issue#24: Fix a bug in repacking AutoGPTQ quantized parameters.

Modified: python/bitblas/module/init.py

Added testing for "zero_mode=quantized" and for asymmetric zeros with "source_format=uint".
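For readers unfamiliar with the terms, here is a minimal sketch of what asymmetric unsigned quantization with an integer (quantized) zero point looks like, and of why packing order matters when repacking. It is not code from this PR, from BitBLAS, or from AutoGPTQ: every function name is hypothetical, and the lowest-nibble-first packing order is an assumption chosen for illustration, not a statement about either library's layout.

```python
# Illustrative sketch only: not BitBLAS or AutoGPTQ code.
# All names and the nibble order below are assumptions.

def quantize_asymmetric(weights, bits=4):
    """Quantize floats to unsigned ints using an integer zero point."""
    qmax = (1 << bits) - 1
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / qmax or 1.0          # fall back to 1.0 for constant input
    zero = max(0, min(qmax, round(-lo / scale)))  # zero point stored as an integer
    q = [max(0, min(qmax, round(w / scale) + zero)) for w in weights]
    return q, scale, zero


def dequantize(q, scale, zero):
    """Recover approximate floats: w ~ (q - zero) * scale."""
    return [(v - zero) * scale for v in q]


def pack_uint4(values):
    """Pack eight 4-bit values into each 32-bit word, lowest nibble first."""
    assert len(values) % 8 == 0
    words = []
    for i in range(0, len(values), 8):
        word = 0
        for j, v in enumerate(values[i:i + 8]):
            word |= (v & 0xF) << (4 * j)
        words.append(word)
    return words


def unpack_uint4(words):
    """Inverse of pack_uint4, using the same nibble order."""
    return [(w >> (4 * j)) & 0xF for w in words for j in range(8)]


if __name__ == "__main__":
    weights = [-1.0, -0.5, 0.0, 0.25, 0.5, 1.0, 1.5, 2.0]
    q, scale, zero = quantize_asymmetric(weights)
    packed = pack_uint4(q)
    restored = dequantize(unpack_uint4(packed), scale, zero)
    print(q, zero, packed)
    print(restored)
```

A repacking step has to agree with the source on both the nibble order and the zero-point convention; if either differs, the unpacked integers or the dequantized values come out shifted, which is the class of error a test with asymmetric zeros is suited to catch.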

tzj-fxz commented 2 weeks ago

@microsoft-github-policy-service agree