IST-DASLab / gptq
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
https://arxiv.org/abs/2210.17323
Apache License 2.0 · 1.92k stars · 153 forks
Issues
Updated the dataset input code · #58 · chandraprvkvsh · closed 1 minute ago · 0 comments
Reproducing Table 3 · #57 · mjyun01 · opened 4 weeks ago · 0 comments
Test on CNN model containing group conv by GPTQ method · #56 · xd1073321804 · opened 5 months ago · 0 comments
Reconstruct Quantized Model Layer in torch · #55 · puja93 · opened 5 months ago · 0 comments
Dct 1d v1 · #54 · igormolybog · closed 7 months ago · 0 comments
GPTQ on BERT based · #53 · zhanyil2 · opened 8 months ago · 0 comments
AssertionError · #52 · GuoYi0 · opened 9 months ago · 0 comments
Regarding the method for computing the Hessian matrix · #51 · baiSongL · opened 10 months ago · 1 comment
How to load and re-evaluate GPTQ pseudo-quantized weights saved in pt format · #50 · CXiaorong · opened 10 months ago · 0 comments
How do I run an INT8 model converted by GPTQ? Please advise · #49 · xxm1668 · opened 11 months ago · 0 comments
Compatibility of Quant3Linear and 4-bit quantization · #48 · mynotwo · opened 11 months ago · 0 comments
act-order on inference · #47 · frankxyy · opened 11 months ago · 0 comments
pack_model takes too long · #46 · westboy123 · closed 11 months ago · 1 comment
Running speed slow on NVIDIA vGPU · #45 · foricee · opened 12 months ago · 0 comments
H_inv not updated · #44 · MilesQLi · closed 11 months ago · 3 comments
About the cuda code, I think "tmp2 >> 30" should be "tmp2 >> 31" · #43 · JachinJiang · closed 1 year ago · 2 comments
Update gptq.py · #42 · zzz0906 · opened 1 year ago · 0 comments
Use modified Cholesky decomposition instead of regularized Cholesky · #41 · jiahao · opened 1 year ago · 0 comments
Why is the wikitext-2 ppl calculated in the code lower than the ppl by lm-evaluation-harness? · #40 · Chocolife-96 · opened 1 year ago · 0 comments
LAMBADA evaluation accuracy · #39 · kayhanbehdin · opened 1 year ago · 0 comments
How should I verify the speedup effect of the algorithm? · #38 · moonlightian · opened 1 year ago · 0 comments
How to run the quantized model for predictions on my prompts? · #37 · tarunmcom · opened 1 year ago · 0 comments
How can we use this lib to quantize Falcon7b / 40b models? · #36 · tarunmcom · opened 1 year ago · 0 comments
How to adopt GPTQ on Conv2d with `groups` attribute? · #35 · TMYuan · opened 1 year ago · 1 comment
PPL results on wikitext/ptb/c4 are worse than the official result · #34 · xingyueye · opened 1 year ago · 2 comments
Can GPTQ models be used for fine-tuning? · #33 · siddhsql · closed 1 year ago · 2 comments
Is there a beginner's guide to the GPTQ algorithm? · #32 · vgoklani · closed 1 year ago · 1 comment
The reshape of input_id doesn't match HF OPT model's API · #31 · brian-fb · closed 1 year ago · 1 comment
NVM · #30 · brian-fb · closed 1 year ago · 0 comments
Baptiste Fernandez - Adding my part of the story · #29 · fernandezbaptiste · closed 1 year ago · 0 comments
How to apply 3/4-bit quantization to vision-language model? · #28 · verigle · closed 1 year ago · 1 comment
Question about the difference between the pseudocode and the implementation · #27 · RachelXu7 · closed 1 year ago · 1 comment
GPTQ for BERT · #26 · BecomeAllan · closed 1 year ago · 1 comment
Does GPTQ reduce to OBQ if I set block size to 1? · #25 · zxxmxd · closed 1 year ago · 2 comments
Update gptq.py · #24 · Lihengwannafly · closed 1 year ago · 0 comments
quant_cuda_kernel.cu(212): error: identifier "__hfma2" is undefined · #23 · HueCheng1021 · opened 1 year ago · 1 comment
Can --save work with --groupsize in opt.py? · #22 · Frozenmad · closed 1 year ago · 1 comment
Why no update to Hinv · #21 · deciding · closed 1 year ago · 4 comments
OpenCL Support · #20 · apcameron · closed 1 year ago · 1 comment
About `--sym` zero point · #19 · tpoisonooo · closed 1 year ago · 1 comment
ValueError: not enough values to unpack (expected 2, got 1) · #18 · jinz2014 · closed 1 year ago · 2 comments
quantized GPTJ - error on inference · #17 · imthebilliejoe · closed 1 year ago · 1 comment
Minor fix for llama · #16 · Xiuyu-Li · closed 1 year ago · 1 comment
Conversion of OPT-175B singleton to HF checkpoint · #15 · ayeeyecorp · closed 1 year ago · 0 comments
How to apply 3/4-bit quantization to computer vision models? · #14 · zshn25 · closed 1 year ago · 4 comments
Please comment on why the A100 specific commit makes it faster? · #13 · Qubitium · closed 1 year ago · 2 comments
Feature Request: Add saving quantized weights functionality to bloom.py · #12 · bestpredicts · closed 1 year ago · 1 comment
License issues · #11 · AlpinDale · closed 1 year ago · 1 comment
Pretrained Weights for Bloom and BloomZ (4-bit) · #10 · agemagician · closed 1 year ago · 1 comment
opt_eval error · #9 · liangxiaoyun · closed 1 year ago · 0 comments