IST-DASLab / gptq
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
https://arxiv.org/abs/2210.17323
Apache License 2.0 · 1.92k stars · 153 forks
Issues
Updated the dataset input code · #58 · chandraprvkvsh · closed 1 minute ago · 0 comments
Reproducing Table 3 · #57 · mjyun01 · opened 4 weeks ago · 0 comments
Test on CNN model containing group conv by GPTQ method · #56 · xd1073321804 · opened 5 months ago · 0 comments
Reconstruct Quantized Model Layer in torch · #55 · puja93 · opened 5 months ago · 0 comments
Dct 1d v1 · #54 · igormolybog · closed 7 months ago · 0 comments
GPTQ on BERT based · #53 · zhanyil2 · opened 8 months ago · 0 comments
AssertionError · #52 · GuoYi0 · opened 9 months ago · 0 comments
Regarding the method for computing the Hessian matrix · #51 · baiSongL · opened 10 months ago · 1 comment
How to load and re-evaluate GPTQ pseudo-quantized weights saved in pt format · #50 · CXiaorong · opened 10 months ago · 0 comments
How do I run an INT8 model converted by GPTQ? Please advise · #49 · xxm1668 · opened 11 months ago · 0 comments
Compatibility of Quant3Linear and 4-bit quantization · #48 · mynotwo · opened 11 months ago · 0 comments
act-order on inference · #47 · frankxyy · opened 11 months ago · 0 comments
pack_model takes too long · #46 · westboy123 · closed 11 months ago · 1 comment
Running speed slow on NVIDIA vGPU · #45 · foricee · opened 12 months ago · 0 comments
H_inv not updated · #44 · MilesQLi · closed 11 months ago · 3 comments
About the cuda code, I think "tmp2 >> 30" should be "tmp2 >> 31" · #43 · JachinJiang · closed 1 year ago · 2 comments
Update gptq.py · #42 · zzz0906 · opened 1 year ago · 0 comments
Use modified Cholesky decomposition instead of regularized Cholesky · #41 · jiahao · opened 1 year ago · 0 comments
Why is the wikitext-2 ppl calculated in the code lower than the ppl by lm-evaluation-harness? · #40 · Chocolife-96 · opened 1 year ago · 0 comments
LAMBADA evaluation accuracy · #39 · kayhanbehdin · opened 1 year ago · 0 comments
How should I verify the speedup effect of the algorithm? · #38 · moonlightian · opened 1 year ago · 0 comments
How to run the quantized model for predictions on my prompts? · #37 · tarunmcom · opened 1 year ago · 0 comments
How can we use this lib to quantize Falcon7b / 40b models? · #36 · tarunmcom · opened 1 year ago · 0 comments
How to adopt GPTQ on Conv2d with `groups` attribute? · #35 · TMYuan · opened 1 year ago · 1 comment
PPL results on wikitext/ptb/c4 are worse than the official result · #34 · xingyueye · opened 1 year ago · 2 comments
Can GPTQ models be used for fine-tuning? · #33 · siddhsql · closed 1 year ago · 2 comments
Is there a beginner's guide to the GPTQ algorithm? · #32 · vgoklani · closed 1 year ago · 1 comment
The reshape of input_id doesn't match HF OPT model's API · #31 · brian-fb · closed 1 year ago · 1 comment
NVM · #30 · brian-fb · closed 1 year ago · 0 comments
Baptiste Fernandez - Adding my part of the story · #29 · fernandezbaptiste · closed 1 year ago · 0 comments
How to apply 3/4-bit quantization to vision-language model? · #28 · verigle · closed 1 year ago · 1 comment
Question about the difference between the pseudocode and the implementation · #27 · RachelXu7 · closed 1 year ago · 1 comment
GPTQ for BERT · #26 · BecomeAllan · closed 1 year ago · 1 comment
Does GPTQ reduce to OBQ if I set block size to 1? · #25 · zxxmxd · closed 1 year ago · 2 comments
Update gptq.py · #24 · Lihengwannafly · closed 1 year ago · 0 comments
quant_cuda_kernel.cu(212): error: identifier "__hfma2" is undefined · #23 · HueCheng1021 · opened 1 year ago · 1 comment
Can --save work with --groupsize in opt.py? · #22 · Frozenmad · closed 1 year ago · 1 comment
Why no update to Hinv · #21 · deciding · closed 1 year ago · 4 comments
OpenCL Support · #20 · apcameron · closed 1 year ago · 1 comment
About `--sym` zero point · #19 · tpoisonooo · closed 1 year ago · 1 comment
ValueError: not enough values to unpack (expected 2, got 1) · #18 · jinz2014 · closed 1 year ago · 2 comments
quantized GPTJ - error on inference · #17 · imthebilliejoe · closed 1 year ago · 1 comment
Minor fix for llama · #16 · Xiuyu-Li · closed 1 year ago · 1 comment
Conversion of OPT-175B singleton to HF checkpoint · #15 · ayeeyecorp · closed 1 year ago · 0 comments
How to apply 3/4-bit quantization to computer vision models? · #14 · zshn25 · closed 1 year ago · 4 comments
Please comment on why the A100 specific commit makes it faster? · #13 · Qubitium · closed 1 year ago · 2 comments
Feature Request: Add saving quantized weights functionality to bloom.py · #12 · bestpredicts · closed 1 year ago · 1 comment
License issues · #11 · AlpinDale · closed 1 year ago · 1 comment
Pretrained Weights for Bloom and BloomZ (4-bit) · #10 · agemagician · closed 1 year ago · 1 comment
opt_eval error · #9 · liangxiaoyun · closed 1 year ago · 0 comments