IST-DASLab/gptq
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
https://arxiv.org/abs/2210.17323
Apache License 2.0
1.95k stars, 155 forks
issues
Problem about value range of valenc in func get_c4()
#59
88099981
opened
2 weeks ago
1
feat: using custom calibration data for quantization
#58
chandraprvkvsh
closed
2 weeks ago
0
Reproducing Table3.
#57
mjyun01
opened
1 month ago
0
Test on CNN model containing group conv by GPTQ method
#56
xd1073321804
opened
6 months ago
0
Reconstruct Quantized Model Layer in torch.
#55
puja93
opened
6 months ago
0
Dct 1d v1
#54
igormolybog
closed
8 months ago
0
GPTQ on BERT based
#53
zhanyil2
opened
9 months ago
0
AssertionError
#52
GuoYi0
opened
10 months ago
0
Regarding the method for computing the Hessian matrix.
#51
baiSongL
opened
11 months ago
1
GPTQ pseudo-quantization saved weights (pt format) How load Re-evaluation
#50
CXiaorong
opened
11 months ago
0
How do I run an INT8 model converted with GPTQ? Guidance from the experts would be appreciated
#49
xxm1668
opened
11 months ago
0
Compatibility of Quant3Linear and 4-bit quantization
#48
mynotwo
opened
11 months ago
0
act-order on inference
#47
frankxyy
opened
12 months ago
0
pack_model takes too long time
#46
westboy123
closed
12 months ago
1
running speed slow on NVIDIA vGPU
#45
foricee
opened
1 year ago
0
H_inv not updated
#44
MilesQLi
closed
1 year ago
3
About the cuda code, I think "tmp2 >> 30" should be " tmp2 >> 31"
#43
JachinJiang
closed
1 year ago
2
Update gptq.py
#42
zzz0906
opened
1 year ago
0
Use modified Cholesky decomposition instead of regularized Cholesky
#41
jiahao
opened
1 year ago
0
Why is the wikitext-2 ppl calculated in the code lower than the ppl by lm-evaluation-harness?
#40
Chocolife-96
opened
1 year ago
0
LAMBADA evaluation accuracy
#39
kayhanbehdin
opened
1 year ago
0
How should I verify the speedup effect of the algorithm?
#38
moonlightian
opened
1 year ago
0
How to run the quantized model for perditions on my prompts?
#37
tarunmcom
opened
1 year ago
0
How can we use this lib to quantize Falcon7b / 40b models?
#36
tarunmcom
opened
1 year ago
0
How to adopt GPTQ on Conv2d with `groups` attribute?
#35
TMYuan
opened
1 year ago
1
PPL results on wikitext/ptb/c4 are worse than the official result
#34
xingyueye
opened
1 year ago
2
Can GPTQ models be used for fine-tuning?
#33
siddhsql
closed
1 year ago
2
Is there a beginners guide to the GPTQ algorithm?
#32
vgoklani
closed
1 year ago
1
The reshape of input_id doesn't match HF OPT model's API
#31
brian-fb
closed
1 year ago
1
NVM
#30
brian-fb
closed
1 year ago
0
Baptiste Fernandez - Adding my part of the story
#29
fernandezbaptiste
closed
1 year ago
0
How to apply 3/4-bit quantization to vision-language model?
#28
verigle
closed
1 year ago
1
Question about the difference between the pseudocode and the implementation
#27
RachelXu7
closed
1 year ago
1
GPTQ for BERT
#26
BecomeAllan
closed
1 year ago
1
Does GPTQ reduce to OBQ if I set block size to 1?
#25
zxxmxd
closed
1 year ago
2
Update gptq.py
#24
Lihengwannafly
closed
1 year ago
0
quant_cuda_kernel.cu(212): error: identifier "__hfma2" is undefined
#23
HueCheng1021
opened
1 year ago
1
Can --save work with --groupsize in opt.py?
#22
Frozenmad
closed
1 year ago
1
Why no update to Hinv
#21
deciding
closed
1 year ago
4
OpenCL Support
#20
apcameron
closed
1 year ago
1
About `--sym` zero point
#19
tpoisonooo
closed
1 year ago
1
ValueError: not enough values to unpack (expected 2, got 1)
#18
jinz2014
closed
1 year ago
2
quantized GPTJ - error on inference
#17
imthebilliejoe
closed
1 year ago
1
Minor fix for llama
#16
Xiuyu-Li
closed
1 year ago
1
Conversion of OPT-175B singleton to HF checkpoint
#15
ayeeyecorp
closed
1 year ago
0
How to apply 3/4-bit quantization to computer vision models?
#14
zshn25
closed
1 year ago
4
Please comment on why the A100 specific commit makes it faster?
#13
Qubitium
closed
1 year ago
2
Title: Feature Request: Add Saving Quantized Weights Functionality to bloom.py
#12
bestpredicts
closed
1 year ago
1
License issues
#11
AlpinDale
closed
1 year ago
1
Pretrained Weights for Bloom and BloomZ (4-bit)
#10
agemagician
closed
1 year ago
1