HandH1998 / QQQ
QQQ is an innovative and hardware-optimized W4A8 quantization solution for LLMs.
https://arxiv.org/pdf/2406.09904
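For context, "W4A8" means 4-bit weights with 8-bit activations. A minimal numpy sketch of symmetric per-channel quantization, illustrative only: the helper name and shapes are assumptions for this example, and QQQ's actual scheme (scale handling, grouping, GPTQ, rotation, and the fused kernels) is described in the paper, not reproduced here.

```python
import numpy as np

def quantize_symmetric(x, n_bits, axis=-1):
    """Map floats to signed integers with a per-channel scale.

    Illustrative helper (not QQQ's API): scale is chosen so the
    largest magnitude along `axis` maps to the integer maximum.
    """
    qmax = 2 ** (n_bits - 1) - 1          # 7 for 4-bit, 127 for 8-bit
    scale = np.abs(x).max(axis=axis, keepdims=True) / qmax
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int32)
    return q, scale

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8)).astype(np.float32)   # weight matrix
a = rng.standard_normal((8,)).astype(np.float32)     # activation vector

qW, sW = quantize_symmetric(W, n_bits=4)   # int4 range [-8, 7]
qa, sa = quantize_symmetric(a, n_bits=8)   # int8 range [-128, 127]

# Integer GEMM with int32 accumulation, then rescale back to float
y_int = qW @ qa
y = y_int * (sW.squeeze(-1) * sa)

print(np.abs(y - W @ a).max())  # small quantization error vs. float GEMM
```

The benefit of this layout is that the inner product runs entirely in integer arithmetic; the float scales are applied once per output channel.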
91 stars · 8 forks
issues
Question about installation (#25) · qingkongby · opened 2 weeks ago · 1 comment
Transformer 4.46.1 compat (#24) · Qubitium · opened 3 weeks ago · 5 comments
Question about the qwen2-1.5b model (#23) · darrenearl · opened 1 month ago · 5 comments
Question about building W4A8 on AMD platform (#22) · XIAOHUIL1 · closed 1 month ago · 2 comments
rotate + lm_head quantization (#21) · RanchiZhao · opened 1 month ago · 1 comment
rotation+gptq data (#20) · Andy0422 · opened 1 month ago · 7 comments
Question about Marlin fetch_to_registers (#19) · darrenearl · opened 1 month ago · 0 comments
bugs: qqq_gemm.cu(183): error: identifier "__hfma2" is undefined (#18) · Andy0422 · closed 1 month ago · 1 comment
Qwen2-1.5B accuracy is completely unusable after quantization (#17) · Juelianqvq · closed 1 month ago · 18 comments
Qwen2-72B-Instruct packing failed (#16) · Juelianqvq · closed 2 months ago · 2 comments
Condition to achieve linear speedup? (#15) · jiwonsong-dev · opened 2 months ago · 18 comments
Qwen2 supported? (#14) · Juelianqvq · closed 2 months ago · 5 comments
Question on rotation (#13) · cli99 · opened 3 months ago · 7 comments
Does QQQ linear support H100? (#12) · donglinz · closed 2 months ago · 1 comment
Plz share some calibration dataset or examples (#11) · skykiseki · closed 2 months ago · 2 comments
Question about group_size (#10) · darrenearl · opened 3 months ago · 1 comment
Possibility of using different group size setting (#9) · NicoNico6 · opened 3 months ago · 6 comments
smooth.py reports an error (#8) · darrenearl · closed 2 months ago · 1 comment
The model quantized with QQQ W4A8 seems to have problems... (#7) · Zhao-Dongyu · closed 3 months ago · 2 comments
Can MLA be smoothed? (#6) · RanchiZhao · closed 4 months ago · 5 comments
[New Model Supported] MiniCPM-2.4B (#5) · RanchiZhao · closed 4 months ago · 3 comments
What is the prior for loss/error? (#4) · RanchiZhao · closed 4 months ago · 1 comment
[QST] Speedup of GEMM (#3) · Hongbosherlock · closed 4 months ago · 24 comments
[QST] Scale factors and benchmarks (#2) · jeromeku · closed 4 months ago · 30 comments
How to use custom calib data? (#1) · Juelianqvq · closed 5 months ago · 7 comments