microsoft/BitBLAS
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
MIT License
423 stars
34 forks
issues
[AMD][TL] Introduce K Pack and a Conflict Free swizzling into Matrix Core
#248
LeiWang1999
opened
2 days ago
0
[Dev][AMD] Support LDS and Flash Attention for AMD Backend
#247
LeiWang1999
closed
1 week ago
0
[Dev][AMD] Implement LDS Async Copy for CDNA Arch
#246
LeiWang1999
closed
1 week ago
0
[Docs] update the contributing's table of contents
#245
emmanuel-ferdman
closed
2 weeks ago
1
Example of bitblas/ladder with dtypes like int3, int5, int6, int7
#244
yaoyaoding
opened
2 weeks ago
4
[Dev] Fix illegal pass order
#243
LeiWang1999
closed
2 weeks ago
0
[Dev][Relax] Update Bitblas end2end tuning example with relax
#242
LeiWang1999
closed
2 weeks ago
0
[Dev] Fix some lint issues
#241
LeiWang1999
closed
2 weeks ago
0
[Dev] Enhance Infra for ROCM
#240
LeiWang1999
closed
2 weeks ago
0
[CI] Disable Benchmark workflow due to github action v4 updates
#239
LeiWang1999
closed
2 weeks ago
0
[Dev][HIP] Fix MFMA Codegen
#238
LeiWang1999
closed
2 weeks ago
0
[DEV][TL] Support AMD Matrix Core Implementation
#237
LeiWang1999
closed
2 weeks ago
0
[Dev] Enhance TileLang Backend and fix a bug for INT4xINT2
#236
LeiWang1999
closed
3 weeks ago
0
[Dev] Update News in Readme
#235
LeiWang1999
closed
3 weeks ago
0
[Dev][Bitnet] Implement Operator with INT4xINT4/INT2
#234
LeiWang1999
closed
3 weeks ago
0
[Dev][BitNET] Implement INT4xINT2 GEMM
#233
LeiWang1999
closed
3 weeks ago
0
[Dev][TL] Implement MMA INT4 Tensor Core and Correctness Test Case.
#232
LeiWang1999
closed
3 weeks ago
0
[Dev] Support Tile Lang INT8xINT8 TensorCore Macro
#231
LeiWang1999
closed
3 weeks ago
0
[Bugfix] Fix build bug due to submodule update
#230
LeiWang1999
closed
3 weeks ago
0
[Dev][Bugfix] Add target argument and remove override register for hip callback compile
#229
LeiWang1999
closed
4 weeks ago
0
[Dev] Add some tests and examples
#228
LeiWang1999
closed
1 month ago
0
[Issue 192] Tail split support for dynamic matmul
#227
tzj-fxz
closed
1 month ago
3
[Dev][TL] Following updates of Tile Language Backend
#226
LeiWang1999
closed
1 month ago
1
[Dev][AMD] Add AMD CDNA Arch
#225
Cunxiao2002
closed
2 weeks ago
8
[Dev][TL] Implement Tile Language Dequant Matmul and Test Case
#224
LeiWang1999
closed
1 month ago
2
[AMD][HIP] Add HIP Code Generation with Block Primitives from Composable kernel Tile
#223
LeiWang1999
closed
4 weeks ago
0
[Dev][TL] Enhance TL Parser to support flexible tile lang kernel implementation
#222
LeiWang1999
closed
1 month ago
1
[Feature Request][TL] May need an annotation to prevent part of the AST from being translated into TL
#221
LeiWang1999
closed
1 month ago
3
[Dev] Disable smooth layout rewrite for buffer store in some cases
#220
LeiWang1999
closed
1 month ago
1
[Bugfix] Enhance LowerAsyncCopy Pass to handle INT8 dma copy with predicate
#219
LeiWang1999
closed
1 month ago
0
[BUG] Dynamic symbolic may block the lowering phase of async copy
#218
LeiWang1999
closed
1 month ago
3
[Dev][TL] Decouple 3rdparty TileLang Backend with TVM
#217
LeiWang1999
opened
1 month ago
1
[TL] [Issue215] add simplify pass for TL and test script, fixing issue
#216
tzj-fxz
closed
1 month ago
1
[Feature Request] Enhance Simplification to remove unused function arguments
#215
LeiWang1999
closed
1 month ago
3
[Dev][TL] Integrate TL Dequant Implementation into BitBLAS OPs
#214
LeiWang1999
closed
1 month ago
1
[Dev][TL] Merge Hopper and Pipeline Modifications
#213
LeiWang1999
closed
1 month ago
1
[Dev] Add support and test case for Ladder Weight only Transformation Matmul Operator
#212
LeiWang1999
closed
1 month ago
0
[BUG] TVM PopenPoolExecutor may have some bugs on TL Scripts
#211
LeiWang1999
opened
1 month ago
3
[TL] [Doc] add flash attention usage document
#210
tzj-fxz
closed
1 month ago
1
[Feature Request] Parallel Primitive Should be enhanced to improve the performance for irregular shapes
#209
LeiWang1999
opened
1 month ago
0
[Feature Request] AMD HIP Backend Should be Migrated from Ladder branch
#208
LeiWang1999
closed
2 weeks ago
2
[TL] Adapt TL Hardware-aware Search Space with Roller
#207
LeiWang1999
closed
1 month ago
1
'
#206
LeiWang1999
closed
1 month ago
0
[Dev] Enhance Operator Cache to support multi-thread environments
#205
LeiWang1999
closed
1 month ago
0
[BUG] Database or tuning conflict with Multi-GPU Environment
#204
LeiWang1999
closed
1 month ago
3
Installation Failure on CUDA 11.7
#203
rustic-snob
closed
1 month ago
2
[TL] initial implement flashattention op in TL
#202
tzj-fxz
closed
1 month ago
3
[Dev][TL] Hardware Aware Tuning Examples with TL
#201
LeiWang1999
closed
1 month ago
0
[Dev][TL] Add TL BaseScheduler and Library Generator
#200
LeiWang1999
closed
1 month ago
1
[TL] Wrap TL Kernel with Scheduler
#199
LeiWang1999
closed
1 month ago
1