microsoft / BitBLAS

BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
MIT License
428 stars, 34 forks

[Dev][TL] Hardware Aware Tuning Examples with TL #201

Status: Closed (LeiWang1999 closed 2 months ago)

LeiWang1999 commented 2 months ago

This pull request includes several changes to improve the scheduling and tuning capabilities in the bitblas module, along with some code refactoring and cleanup. The most important changes include updating the ThreadPoolExecutor usage, adding hardware-aware configuration methods, introducing a new fine-grained matrix multiplication scheduler, and making various code style improvements.
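For readers unfamiliar with the pattern, here is a minimal sketch of how hardware-aware candidate filtering can be combined with a `ThreadPoolExecutor` to score tile configurations in parallel. It does not use the BitBLAS API: `fits_shared_memory`, `estimate_cost`, `tune`, the 48 KiB shared-memory budget, and the candidate tile sizes are all hypothetical names and values chosen for illustration, and the cost function is a toy stand-in for compiling and profiling a kernel.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from math import ceil

# Hypothetical candidate tile shapes: (block_M, block_N, block_K).
CANDIDATES = [(64, 64, 32), (128, 128, 32), (128, 256, 64), (256, 256, 128)]


def fits_shared_memory(config, smem_bytes=48 * 1024, dtype_bytes=2):
    """Hardware-aware filter: the A and B tiles must fit in shared memory.

    The 48 KiB budget and 2-byte element size are assumptions, not values
    taken from BitBLAS.
    """
    block_m, block_n, block_k = config
    return (block_m * block_k + block_k * block_n) * dtype_bytes <= smem_bytes


def estimate_cost(config, M=1024, N=1024, K=1024):
    """Toy cost model standing in for a real compile-and-profile step.

    It counts the tile iterations needed to cover an M x N x K matmul.
    """
    block_m, block_n, block_k = config
    return ceil(M / block_m) * ceil(N / block_n) * ceil(K / block_k)


def tune(candidates, max_workers=4):
    """Score all feasible candidates in parallel and return the cheapest."""
    feasible = [c for c in candidates if fits_shared_memory(c)]
    if not feasible:
        raise ValueError("no candidate fits the shared-memory budget")
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(estimate_cost, c): c for c in feasible}
        results = [(f.result(), futures[f]) for f in as_completed(futures)]
    # Tuples compare by cost first, so min() picks the lowest-cost config.
    return min(results)[1]


if __name__ == "__main__":
    print("best config:", tune(CANDIDATES))
```

In a real tuner the scoring step would compile and time each kernel, which is where a thread pool pays off; the filter step is what keeps infeasible configurations from being compiled at all.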

Enhancements to Scheduling and Tuning:

New Scheduler Introduction:

Code Refactoring and Cleanup: