bytedance/decoupleQ
A quantization algorithm for LLM
Apache License 2.0
86 stars, 5 forks
issues
LinearA16 and LinearW2A16 output does not match
#12
XLuoxing
opened
1 week ago
6
decoupleQ.decoupleQ_kernels
#11
hsb1995
opened
1 month ago
1
Matrix multiplication performance data
#10
yyfcc17
closed
4 days ago
2
RuntimeError: Unsupported compute type Float
#9
ChuanhongLi
opened
1 month ago
2
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
#8
ChuanhongLi
opened
1 month ago
8
About the speedup
#7
hikq123
opened
1 month ago
4
Error after several hours of quantization, network problem: ConnectionError
#6
chuangzhidan
opened
2 months ago
20
Question about run_inference_llama
#5
ChuanhongLi
closed
1 month ago
1
How can it be used on a custom model?
#4
tianyma
opened
2 months ago
5
NaN problem after quantization
#3
huyiming2018
opened
2 months ago
4
add llama w2 infer demo
#2
MyPandaShaoxiang
closed
2 months ago
1
About inference with the quantized model
#1
ChuanhongLi
opened
2 months ago
16