microsoft/BitBLAS
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
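To make the description concrete: "mixed-precision matrix multiplication" here means multiplying float16 activations by low-bit (e.g. int4) quantized weights. The NumPy sketch below spells out the reference arithmetic that such a fused GPU kernel computes (dequantize, then matmul with float32 accumulation); the function and parameter names are illustrative, not the BitBLAS API.

```python
import numpy as np

def dequant_matmul(a_fp16, w_int4, scales):
    """Reference for a W4A16 mixed-precision GEMM.

    a_fp16: (M, K) float16 activations.
    w_int4: (K, N) int8 array holding values in [-8, 7] (the int4 range).
    scales: (N,)   per-output-channel dequantization scales.
    """
    # Dequantize the low-bit weights to float32 ...
    w_fp32 = w_int4.astype(np.float32) * scales.astype(np.float32)
    # ... then matmul with float32 accumulation, casting the result back to fp16.
    return (a_fp16.astype(np.float32) @ w_fp32).astype(np.float16)

rng = np.random.default_rng(0)
a = rng.standard_normal((2, 8)).astype(np.float16)
w = rng.integers(-8, 8, size=(8, 4), dtype=np.int8)  # simulated int4 weights
s = np.full(4, 0.05, dtype=np.float16)
y = dequant_matmul(a, w, s)  # shape (2, 4), dtype float16
```

A kernel library avoids ever materializing `w_fp32` in global memory, decoding the packed low-bit weights on the fly inside the matmul kernel; the sketch only fixes the semantics the kernel must match.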
MIT License · 190 stars · 21 forks
Issues
| # | Title | Author | Status | Comments |
|---|-------|--------|--------|----------|
| #62 | WHLs for CUDA 11.7, 11.8, and 12.0 for future releases | Qubitium | open (1 hour ago) | 1 |
| #61 | Fix GPU model missing from TVM target remap | Qubitium | closed (4 hours ago) | 0 |
| #60 | GH200 support | sidereior | closed (3 days ago) | 1 |
| #59 | [FIX] Must validate ENV settings or wrong GPU selected by nvidia-smi | Qubitium | closed (4 days ago) | 2 |
| #58 | [FIX] GPU detection in multi-GPU env and OEM A100 not matching TVM | Qubitium | closed (5 days ago) | 1 |
| #57 | [Dev] Issue #24: Fix a bug in repacking AutoGPTQ quantized parameters | tzj-fxz | closed (1 week ago) | 1 |
| #56 | Mismatch between BitBLAS result and torch.matmul in QuickStart.md with batch size > 1 | MekkCyber | open (1 week ago) | 2 |
| #55 | Does BitBLAS support ROCm/AMD GPUs? | radna0 | open (1 week ago) | 2 |
| #54 | [Dev] Fix a bug within FP8 E4M3 fast decoding | LeiWang1999 | closed (2 weeks ago) | 0 |
| #53 | [BugFix] Fix a bug in static-shape build | LeiWang1999 | closed (2 weeks ago) | 0 |
| #52 | [Dev] Fix GEMV dynamic scheduling with split-K | LeiWang1999 | closed (2 weeks ago) | 0 |
| #51 | [Dev] Bump version to 0.0.1.dev9 | LeiWang1999 | closed (2 weeks ago) | 0 |
| #50 | [Dev] Improve general matmul with split-K | LeiWang1999 | closed (2 weeks ago) | 0 |
| #49 | [Dev] Bump version to dev0.8 and fix INT8xINT2 issue | LeiWang1999 | closed (3 weeks ago) | 0 |
| #48 | [Feature] Enhance MatmulOps with split-K support | LeiWang1999 | closed (3 weeks ago) | 0 |
| #47 | Perplexity evaluation too high for 1bitLLM/bitnet_b1_58-3B | MekkCyber | closed (2 weeks ago) | 18 |
| #46 | [BugFix] Fix UINT/INT8 dequantize implementation and optimize the schedule template for float32 accumulation | LeiWang1999 | closed (3 weeks ago) | 0 |
| #45 | [Target] Improve TVM target-related items | LeiWang1999 | closed (3 weeks ago) | 0 |
| #44 | BitNet training produces NaN | robotzheng | closed (3 weeks ago) | 5 |
| #43 | [DEV][FP8] Improve E4M3 decoding | LeiWang1999 | closed (1 month ago) | 0 |
| #42 | [FP8] Support weight dequantize FP16xFP8_E4M3 | LeiWang1999 | closed (1 month ago) | 0 |
| #41 | How can I obtain the nearly 4x speedup of W4A16 matrix-vector computation? | ChenMnZ | closed (1 month ago) | 5 |
| #40 | [Question] Why is it so slow to instantiate a BitBLAS linear layer? | ChenMnZ | closed (2 weeks ago) | 5 |
| #39 | undefined symbol: ncclCommRegister | robotzheng | closed (2 weeks ago) | 2 |
| #38 | Update export.sh with pip installation command | Hamerlate | closed (3 weeks ago) | 1 |
| #37 | NF4: compilation errors | HanGuo97 | closed (1 month ago) | 8 |
| #36 | [BitNet] Disable accelerate for BitNet | LeiWang1999 | closed (1 month ago) | 0 |
| #35 | Is int1 x float16 supported? | chromecast56 | closed (1 month ago) | 15 |
| #34 | [BUG] Make sure the torch tensor is contiguous | LeiWang1999 | closed (1 month ago) | 0 |
| #33 | Matrix multiplication outputs unexpected values if W is transposed through PyTorch's `t()` function | rokada-br | closed (1 month ago) | 2 |
| #32 | [Bug] Improve the default config value and fix a TensorCore config bug for small shapes | LeiWang1999 | closed (1 month ago) | 0 |
| #31 | Exception during saving cache to DB | ostix360 | closed (1 month ago) | 13 |
| #30 | [FP8] Improve tensor adapter to support FP8 conversion between torch and numpy | LeiWang1999 | closed (1 month ago) | 0 |
| #29 | [FP8] Support FP8 MatrixCore codegen and related tests | LeiWang1999 | closed (1 month ago) | 0 |
| #28 | Support for int8xint8 matmul with scaling | ruofan-wu | closed (1 month ago) | 5 |
| #27 | About the BitNet model | littlefive5 | closed (1 month ago) | 1 |
| #26 | BitNet gives NaN for perplexity | joey00072 | closed (1 month ago) | 21 |
| #25 | [Kernel] Extend fast decoding to UINT2 + QZeros | LeiWang1999 | closed (2 months ago) | 0 |
| #24 | Cannot use uint2 x float16 | xzyaoi | closed (1 week ago) | 9 |
| #23 | Fix typos | xzyaoi | closed (2 months ago) | 1 |
| #22 | Add `torch` as a requirement | mgoin | closed (2 months ago) | 1 |
| #21 | [CUDA GRAPH] Support CUDA stream in the wrap function | LeiWang1999 | closed (2 months ago) | 0 |
| #20 | [FIX] Update README.md with corrected links and paths | LeiWang1999 | closed (2 months ago) | 0 |
| #19 | [DOCS] Remove some figures | LeiWang1999 | closed (2 months ago) | 0 |
| #18 | [DOCS] README update | LeiWang1999 | closed (2 months ago) | 0 |
| #17 | [DOCS] Update the documentation of the int range | LeiWang1999 | closed (2 months ago) | 0 |
| #16 | [DEV] Remove extra dependencies and refactor the TVM import | LeiWang1999 | closed (2 months ago) | 0 |
| #15 | [Dev] Refactor the range of the INT format to (-max_int_value - 1, max_int_value) | LeiWang1999 | closed (2 months ago) | 0 |
| #14 | [DEV] Transform codebase from Azure to GitHub | LeiWang1999 | closed (2 months ago) | 0 |
| #13 | Bump transformers from 4.36.0 to 4.38.0 | dependabot[bot] | closed (2 months ago) | 1 |