HandH1998 / QQQ
QQQ is an innovative and hardware-optimized W4A8 quantization solution for LLMs.
https://arxiv.org/pdf/2406.09904
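For context, "W4A8" means 4-bit weights with 8-bit activations. A minimal numpy sketch of symmetric per-channel quantization, illustrative only: the helper name and shapes are assumptions for this example, and QQQ's actual scheme (scale handling, grouping, GPTQ, rotation, and the fused kernels) is described in the paper, not reproduced here.

```python
import numpy as np

def quantize_symmetric(x, n_bits, axis=-1):
    """Map floats to signed integers with a per-channel scale.

    Illustrative helper (not QQQ's API): scale is chosen so the
    largest magnitude along `axis` maps to the integer maximum.
    """
    qmax = 2 ** (n_bits - 1) - 1          # 7 for 4-bit, 127 for 8-bit
    scale = np.abs(x).max(axis=axis, keepdims=True) / qmax
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int32)
    return q, scale

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8)).astype(np.float32)   # weight matrix
a = rng.standard_normal((8,)).astype(np.float32)     # activation vector

qW, sW = quantize_symmetric(W, n_bits=4)   # int4 range [-8, 7]
qa, sa = quantize_symmetric(a, n_bits=8)   # int8 range [-128, 127]

# Integer GEMM with int32 accumulation, then rescale back to float
y_int = qW @ qa
y = y_int * (sW.squeeze(-1) * sa)

print(np.abs(y - W @ a).max())  # small quantization error vs. float GEMM
```

The benefit of this layout is that the inner product runs entirely in integer arithmetic; the float scales are applied once per output channel.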
91 stars · 8 forks
issues
Question about installation (#25) · qingkongby · opened 2 weeks ago · 1 comment
Transformer 4.46.1 compat (#24) · Qubitium · opened 3 weeks ago · 5 comments
Question about the qwen2-1.5b model (#23) · darrenearl · opened 1 month ago · 5 comments
Question about building W4A8 on AMD platform (#22) · XIAOHUIL1 · closed 1 month ago · 2 comments
rotate + lm_head quantization (#21) · RanchiZhao · opened 1 month ago · 1 comment
rotation+gptq data (#20) · Andy0422 · opened 1 month ago · 7 comments
Question about Marlin fetch_to_registers (#19) · darrenearl · opened 1 month ago · 0 comments
bugs: qqq_gemm.cu(183): error: identifier "__hfma2" is undefined (#18) · Andy0422 · closed 1 month ago · 1 comment
Qwen2-1.5B accuracy is completely unusable after quantization (#17) · Juelianqvq · closed 1 month ago · 18 comments
Qwen2-72B-Instruct packing failed (#16) · Juelianqvq · closed 2 months ago · 2 comments
Condition to achieve linear speedup? (#15) · jiwonsong-dev · opened 2 months ago · 18 comments
Qwen2 supported? (#14) · Juelianqvq · closed 2 months ago · 5 comments
Question on rotation (#13) · cli99 · opened 3 months ago · 7 comments
Does QQQ linear support H100? (#12) · donglinz · closed 2 months ago · 1 comment
Plz share some calibration dataset or examples (#11) · skykiseki · closed 2 months ago · 2 comments
Question about group_size (#10) · darrenearl · opened 3 months ago · 1 comment
Possibility of using different group size setting (#9) · NicoNico6 · opened 3 months ago · 6 comments
smooth.py reports an error (#8) · darrenearl · closed 2 months ago · 1 comment
The model quantized with QQQ W4A8 seems to have problems... (#7) · Zhao-Dongyu · closed 3 months ago · 2 comments
Can MLA be smoothed? (#6) · RanchiZhao · closed 4 months ago · 5 comments
[New Model Supported] MiniCPM-2.4B (#5) · RanchiZhao · closed 4 months ago · 3 comments
What is the prior for loss/error? (#4) · RanchiZhao · closed 4 months ago · 1 comment
[QST] Speedup of GEMM (#3) · Hongbosherlock · closed 4 months ago · 24 comments
[QST] Scale factors and benchmarks (#2) · jeromeku · closed 4 months ago · 30 comments
How to use custom calib data? (#1) · Juelianqvq · closed 5 months ago · 7 comments