microsoft / BitBLAS

BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
MIT License

[Feature] Enhancing MatmulOps with Splitk Support #48

Closed LeiWang1999 closed 3 months ago

LeiWang1999 commented 3 months ago

This pull request makes several changes across the python/bitblas package to improve the BitBLAS library: updates to the Rasterization and TensorCoreExtraConfig classes, modifications to the fast_decode_impl method, and the addition of the MatmulWithSplitK class.

Updates to Rasterization and TensorCoreExtraConfig classes:

Modifications to fast_decode_impl method:

Addition of MatmulWithSplitK class:
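
For readers unfamiliar with the technique, here is a minimal pure-Python sketch of what split-K means for a matrix multiplication: the reduction dimension K is cut into `split_k` slices, each slice produces a partial M x N result, and the partials are summed. This is only an illustration of the general idea, not the MatmulWithSplitK implementation or its API; the function names `matmul_partial` and `matmul_split_k` and the `split_k` parameter are hypothetical and chosen for this example.

```python
# Illustrative split-K matrix multiplication in pure Python.
# NOT the BitBLAS MatmulWithSplitK API: every name below is hypothetical.


def matmul_partial(a, b, k_start, k_end):
    """Multiply columns [k_start, k_end) of a by rows [k_start, k_end) of b."""
    m, n = len(a), len(b[0])
    out = [[0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0
            for k in range(k_start, k_end):
                acc += a[i][k] * b[k][j]
            out[i][j] = acc
    return out


def matmul_split_k(a, b, split_k):
    """Compute a @ b by splitting the K dimension into split_k slices."""
    if split_k < 1:
        raise ValueError("split_k must be at least 1")
    k_total = len(b)
    m, n = len(a), len(b[0])
    # Ceiling division so that the slices together cover all of K.
    chunk = (k_total + split_k - 1) // split_k
    result = [[0] * n for _ in range(m)]
    for s in range(split_k):
        k_start = s * chunk
        k_end = min(k_start + chunk, k_total)
        if k_start >= k_end:
            break  # more slices requested than K can fill
        # On a GPU each slice could run in its own thread block;
        # here the slices simply run one after another.
        partial = matmul_partial(a, b, k_start, k_end)
        # Reduction step: accumulate the partial result.
        for i in range(m):
            for j in range(n):
                result[i][j] += partial[i][j]
    return result


a = [[1, 2, 3, 4],
     [5, 6, 7, 8]]
b = [[1, 0],
     [0, 1],
     [1, 1],
     [2, 3]]

# Splitting K must not change the result.
print(matmul_split_k(a, b, 1) == matmul_split_k(a, b, 2))
```

In general, split-K tends to help when M and N are small relative to K, because the extra slices expose more parallel work at the cost of a final reduction over the partial results.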

Other important changes: