microsoft / BitBLAS
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
MIT License · 191 stars · 21 forks
Issues
#63 [Dev] Potentially improve performance through block reduction (LeiWang1999, closed 21 hours ago, 1 comment)
#62 WHLs for CUDA 11.7, 11.8, and 12.0 for future releases (Qubitium, opened 4 days ago, 1 comment)
#61 Fix GPU model missing from TVM target remap (Qubitium, closed 5 days ago, 0 comments)
#60 GH200 support (sidereior, closed 1 week ago, 1 comment)
#59 [FIX] Must validate ENV settings or wrong GPU selected by nvidia-smi (Qubitium, closed 1 week ago, 2 comments)
#58 [FIX] GPU detection in multi-GPU env and OEM A100 not matching TVM (Qubitium, closed 1 week ago, 1 comment)
#57 [Dev] Issue #24: Fix a bug in repacking AutoGPTQ quantized parameters (tzj-fxz, closed 2 weeks ago, 1 comment)
#56 Mismatch between BitBLAS result and torch.matmul in QuickStart.md with batch size > 1 (MekkCyber, opened 2 weeks ago, 2 comments)
#55 Does BitBLAS support ROCm/AMD GPUs? (radna0, opened 2 weeks ago, 2 comments)
#54 [Dev] Fix a bug within FP8 E4M3 fast decoding (LeiWang1999, closed 3 weeks ago, 0 comments)
#53 [BugFix] Fix a bug in static shape build (LeiWang1999, closed 3 weeks ago, 0 comments)
#52 [Dev] Fix GEMV dynamic scheduling with Splitk (LeiWang1999, closed 3 weeks ago, 0 comments)
#51 [Dev] Bump version to 0.0.1.dev9 (LeiWang1999, closed 3 weeks ago, 0 comments)
#50 [Dev] Improve general Matmul with Splitk (LeiWang1999, closed 3 weeks ago, 0 comments)
#49 [Dev] Bump version to dev0.8 and fix INT8xINT2 issue (LeiWang1999, closed 3 weeks ago, 0 comments)
#48 [Feature] Enhance MatmulOps with Splitk support (LeiWang1999, closed 3 weeks ago, 0 comments)
#47 Perplexity evaluation too high for 1bitLLM/bitnet_b1_58-3B (MekkCyber, closed 3 weeks ago, 18 comments)
#46 [BugFix] Fix UINT/INT8 dequantize implementation and optimize the schedule template for float32 accum (LeiWang1999, closed 4 weeks ago, 0 comments)
#45 [Target] Improve TVM target-related items (LeiWang1999, closed 4 weeks ago, 0 comments)
#44 BitNet training produces NaN (robotzheng, closed 3 weeks ago, 5 comments)
#43 [Dev][FP8] Improve E4M3 decoding (LeiWang1999, closed 1 month ago, 0 comments)
#42 [FP8] Support weight dequantize FP16xFP8_E4M3 (LeiWang1999, closed 1 month ago, 0 comments)
#41 How can I obtain the nearly 4x speedup of W4A16 matrix-vector computation? (ChenMnZ, closed 1 month ago, 5 comments)
#40 [Question] Why is it so slow to instantiate a BitBLAS linear layer? (ChenMnZ, closed 3 weeks ago, 5 comments)
#39 undefined symbol: ncclCommRegister (robotzheng, closed 3 weeks ago, 2 comments)
#38 Update export.sh with pip installation command (Hamerlate, closed 4 weeks ago, 1 comment)
#37 NF4: compilation errors (HanGuo97, closed 1 month ago, 8 comments)
#36 [BitNet] Disable accelerate for BitNet (LeiWang1999, closed 1 month ago, 0 comments)
#35 Is int1 x float16 supported? (chromecast56, closed 1 month ago, 15 comments)
#34 [Bug] Make sure the torch tensor is contiguous (LeiWang1999, closed 1 month ago, 0 comments)
#33 Matrix multiplication outputs unexpected values if W is transposed through PyTorch's `t()` function (rokada-br, closed 1 month ago, 2 comments)
#32 [Bug] Improve the default config value and fix a bug for TensorCore config with small shapes (LeiWang1999, closed 2 months ago, 0 comments)
#31 Exception while saving cache to DB (ostix360, closed 1 month ago, 13 comments)
#30 [FP8] Improve tensor adapter to support FP8 conversion between torch and numpy (LeiWang1999, closed 2 months ago, 0 comments)
#29 [FP8] Support FP8 MatrixCore codegen and related tests (LeiWang1999, closed 2 months ago, 0 comments)
#28 Support for int8xint8 matmul with scaling (ruofan-wu, closed 2 months ago, 5 comments)
#27 About model BitNet (littlefive5, closed 2 months ago, 1 comment)
#26 BitNet is giving NaN for perplexity (joey00072, closed 1 month ago, 21 comments)
#25 [Kernel] Extend fast decoding to UINT2 + QZeros (LeiWang1999, closed 2 months ago, 0 comments)
#24 Cannot use uint2 x float16 (xzyaoi, closed 2 weeks ago, 9 comments)
#23 Fix typos (xzyaoi, closed 2 months ago, 1 comment)
#22 Add `torch` as a requirement (mgoin, closed 2 months ago, 1 comment)
#21 [CUDA Graph] Support CUDA stream in the wrap function (LeiWang1999, closed 2 months ago, 0 comments)
#20 [FIX] Update README.md with corrected links and paths (LeiWang1999, closed 2 months ago, 0 comments)
#19 [DOCS] Remove some figures (LeiWang1999, closed 2 months ago, 0 comments)
#18 [DOCS] README update (LeiWang1999, closed 2 months ago, 0 comments)
#17 [DOCS] Update the documentation of the INT range (LeiWang1999, closed 2 months ago, 0 comments)
#16 [Dev] Remove extra dependencies and refactor the TVM import (LeiWang1999, closed 2 months ago, 0 comments)
#15 [Dev] Refactor the range of the INT format to (-max_int_value - 1, max_int_value) (LeiWang1999, closed 2 months ago, 0 comments)
#14 [Dev] Transfer the codebase from Azure to GitHub (LeiWang1999, closed 2 months ago, 0 comments)