IST-DASLab/QUIK
Repository for the QUIK project, enabling the use of 4bit kernels for generative inference - EMNLP 2024
Apache License 2.0
172 stars, 12 forks
issues
#15 quik.matmul.int8SpMatmul Question (yeliang2258, opened, 3 months ago, 0 comments)
#14 int8FusedDequantizeCUDA Inference Results are Incorrect (zkf331, opened, 5 months ago, 0 comments)
#13 Fix wheel building problem (guoyuhong, opened, 7 months ago, 0 comments)
#12 how to build cutlass library only with kernels that used in QUIK? (ThisisBillhe, opened, 10 months ago, 0 comments)
#11 [Question] Does QUIK support multi-batch inference? (hanrui1sensetime, opened, 10 months ago, 0 comments)
#10 [Question] How to get act_scales of custom llama-like model? How much calibration data items do we need? Need act_zeros simultaneously? (hanrui1sensetime, opened, 11 months ago, 3 comments)
#9 why there is a half range shift? (yyfcc17, closed, 11 months ago, 6 comments)
#8 Update README.md (eltociear, closed, 1 year ago, 0 comments)
#7 Cloning with HTTPS in README (BlackSamorez, closed, 1 year ago, 1 comment)
#6 Please add license file (xnorai, closed, 1 year ago, 2 comments)
#5 add y in asy fused dequant (xcwang1999, closed, 1 year ago, 0 comments)
#4 Asy dequant fusion (xcwang1999, closed, 1 year ago, 0 comments)
#3 add dequant fusion function (GitHubbeer, closed, 1 year ago, 0 comments)
#2 add dequant fusion function (xcwang1999, closed, 1 year ago, 0 comments)
#1 update (xcwang1999, closed, 1 year ago, 0 comments)