microsoft/BitBLAS
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
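To make the description concrete: "mixed-precision matrix multiplication" here means multiplying float16 activations by low-bit (e.g. int4) quantized weights. The NumPy sketch below spells out the reference arithmetic that such a fused GPU kernel computes (dequantize, then matmul with float32 accumulation); the function and parameter names are illustrative, not the BitBLAS API.

```python
import numpy as np

def dequant_matmul(a_fp16, w_int4, scales):
    """Reference for a W4A16 mixed-precision GEMM.

    a_fp16: (M, K) float16 activations.
    w_int4: (K, N) int8 array holding values in [-8, 7] (the int4 range).
    scales: (N,)   per-output-channel dequantization scales.
    """
    # Dequantize the low-bit weights to float32 ...
    w_fp32 = w_int4.astype(np.float32) * scales.astype(np.float32)
    # ... then matmul with float32 accumulation, casting the result back to fp16.
    return (a_fp16.astype(np.float32) @ w_fp32).astype(np.float16)

rng = np.random.default_rng(0)
a = rng.standard_normal((2, 8)).astype(np.float16)
w = rng.integers(-8, 8, size=(8, 4), dtype=np.int8)  # simulated int4 weights
s = np.full(4, 0.05, dtype=np.float16)
y = dequant_matmul(a, w, s)  # shape (2, 4), dtype float16
```

A kernel library avoids ever materializing `w_fp32` in global memory, decoding the packed low-bit weights on the fly inside the matmul kernel; the sketch only fixes the semantics the kernel must match.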
MIT License · 190 stars · 21 forks
Issues
| # | Title | Author | Status | Comments |
|---|-------|--------|--------|----------|
| #62 | WHLs for CUDA 11.7, 11.8, and 12.0 for future releases | Qubitium | open (1 hour ago) | 1 |
| #61 | Fix GPU model missing from TVM target remap | Qubitium | closed (4 hours ago) | 0 |
| #60 | GH200 support | sidereior | closed (3 days ago) | 1 |
| #59 | [FIX] Must validate ENV settings or wrong GPU selected by nvidia-smi | Qubitium | closed (4 days ago) | 2 |
| #58 | [FIX] GPU detection in multi-GPU env and OEM A100 not matching TVM | Qubitium | closed (5 days ago) | 1 |
| #57 | [Dev] Issue #24: Fix a bug in repacking AutoGPTQ quantized parameters | tzj-fxz | closed (1 week ago) | 1 |
| #56 | Mismatch between BitBLAS result and torch.matmul in QuickStart.md with batch size > 1 | MekkCyber | open (1 week ago) | 2 |
| #55 | Does BitBLAS support ROCm/AMD GPUs? | radna0 | open (1 week ago) | 2 |
| #54 | [Dev] Fix a bug within FP8 E4M3 fast decoding | LeiWang1999 | closed (2 weeks ago) | 0 |
| #53 | [BugFix] Fix a bug in static-shape build | LeiWang1999 | closed (2 weeks ago) | 0 |
| #52 | [Dev] Fix GEMV dynamic scheduling with split-K | LeiWang1999 | closed (2 weeks ago) | 0 |
| #51 | [Dev] Bump version to 0.0.1.dev9 | LeiWang1999 | closed (2 weeks ago) | 0 |
| #50 | [Dev] Improve general matmul with split-K | LeiWang1999 | closed (2 weeks ago) | 0 |
| #49 | [Dev] Bump version to dev0.8 and fix INT8xINT2 issue | LeiWang1999 | closed (3 weeks ago) | 0 |
| #48 | [Feature] Enhance MatmulOps with split-K support | LeiWang1999 | closed (3 weeks ago) | 0 |
| #47 | Perplexity evaluation too high for 1bitLLM/bitnet_b1_58-3B | MekkCyber | closed (2 weeks ago) | 18 |
| #46 | [BugFix] Fix UINT/INT8 dequantize implementation and optimize the schedule template for float32 accumulation | LeiWang1999 | closed (3 weeks ago) | 0 |
| #45 | [Target] Improve TVM target-related items | LeiWang1999 | closed (3 weeks ago) | 0 |
| #44 | BitNet training produces NaN | robotzheng | closed (3 weeks ago) | 5 |
| #43 | [DEV][FP8] Improve E4M3 decoding | LeiWang1999 | closed (1 month ago) | 0 |
| #42 | [FP8] Support weight dequantize FP16xFP8_E4M3 | LeiWang1999 | closed (1 month ago) | 0 |
| #41 | How can I obtain the nearly 4x speedup of W4A16 matrix-vector computation? | ChenMnZ | closed (1 month ago) | 5 |
| #40 | [Question] Why is it so slow to instantiate a BitBLAS linear layer? | ChenMnZ | closed (2 weeks ago) | 5 |
| #39 | undefined symbol: ncclCommRegister | robotzheng | closed (2 weeks ago) | 2 |
| #38 | Update export.sh with pip installation command | Hamerlate | closed (3 weeks ago) | 1 |
| #37 | NF4: compilation errors | HanGuo97 | closed (1 month ago) | 8 |
| #36 | [BitNet] Disable accelerate for BitNet | LeiWang1999 | closed (1 month ago) | 0 |
| #35 | Is int1 x float16 supported? | chromecast56 | closed (1 month ago) | 15 |
| #34 | [BUG] Make sure the torch tensor is contiguous | LeiWang1999 | closed (1 month ago) | 0 |
| #33 | Matrix multiplication outputs unexpected values if W is transposed through PyTorch's `t()` function | rokada-br | closed (1 month ago) | 2 |
| #32 | [Bug] Improve the default config value and fix a TensorCore config bug for small shapes | LeiWang1999 | closed (1 month ago) | 0 |
| #31 | Exception during saving cache to DB | ostix360 | closed (1 month ago) | 13 |
| #30 | [FP8] Improve tensor adapter to support FP8 conversion between torch and numpy | LeiWang1999 | closed (1 month ago) | 0 |
| #29 | [FP8] Support FP8 MatrixCore codegen and related tests | LeiWang1999 | closed (1 month ago) | 0 |
| #28 | Support for int8xint8 matmul with scaling | ruofan-wu | closed (1 month ago) | 5 |
| #27 | About the BitNet model | littlefive5 | closed (1 month ago) | 1 |
| #26 | BitNet gives NaN for perplexity | joey00072 | closed (1 month ago) | 21 |
| #25 | [Kernel] Extend fast decoding to UINT2 + QZeros | LeiWang1999 | closed (2 months ago) | 0 |
| #24 | Cannot use uint2 x float16 | xzyaoi | closed (1 week ago) | 9 |
| #23 | Fix typos | xzyaoi | closed (2 months ago) | 1 |
| #22 | Add `torch` as a requirement | mgoin | closed (2 months ago) | 1 |
| #21 | [CUDA GRAPH] Support CUDA stream in the wrap function | LeiWang1999 | closed (2 months ago) | 0 |
| #20 | [FIX] Update README.md with corrected links and paths | LeiWang1999 | closed (2 months ago) | 0 |
| #19 | [DOCS] Remove some figures | LeiWang1999 | closed (2 months ago) | 0 |
| #18 | [DOCS] README update | LeiWang1999 | closed (2 months ago) | 0 |
| #17 | [DOCS] Update the documentation of the int range | LeiWang1999 | closed (2 months ago) | 0 |
| #16 | [DEV] Remove extra dependencies and refactor the TVM import | LeiWang1999 | closed (2 months ago) | 0 |
| #15 | [Dev] Refactor the range of the INT format to (-max_int_value - 1, max_int_value) | LeiWang1999 | closed (2 months ago) | 0 |
| #14 | [DEV] Transform codebase from Azure to GitHub | LeiWang1999 | closed (2 months ago) | 0 |
| #13 | Bump transformers from 4.36.0 to 4.38.0 | dependabot[bot] | closed (2 months ago) | 1 |