IST-DASLab / gptq

Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
https://arxiv.org/abs/2210.17323
Apache License 2.0

H_inv not updated #44

Closed MilesQLi closed 8 months ago

MilesQLi commented 8 months ago

After each quantization step, `H_inv` should be updated, but in the `fasterquant` code it is not. Is this a bug?

efrantar commented 8 months ago

Hi, these updates are all precomputed in advance by the Cholesky decomposition, see also "Step 3: Cholesky Reformulation" in our paper.
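To make the equivalence concrete, here is a small NumPy sketch (my own illustration, not the repo's code): the rank-1 update that Eq. (5) applies to `H_inv` after each column is exactly one step of outer-product Cholesky, so the rows the algorithm would read from the sequentially updated inverses are, up to normalization, the rows of the upper Cholesky factor of the initial `H_inv`.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

# Random symmetric positive-definite "Hessian" H and its inverse.
A = rng.standard_normal((n, n))
H = A @ A.T + n * np.eye(n)
Hinv = np.linalg.inv(H)

# Sequential elimination: after handling column q, the paper's Eq. (5)
# subtracts a rank-1 term and drops the leading row/column.
M = Hinv.copy()
rows = np.zeros((n, n))
for q in range(n):
    # Leading row of the current inverse, normalized by sqrt of its
    # diagonal entry -- this is what the quantization step consumes.
    rows[q, q:] = M[0, :] / np.sqrt(M[0, 0])
    M = M[1:, 1:] - np.outer(M[0, 1:], M[0, 1:]) / M[0, 0]

# Upper Cholesky factor U of Hinv, i.e. Hinv = U.T @ U.
U = np.linalg.cholesky(Hinv).T

# The precomputed Cholesky rows match the sequentially updated rows.
assert np.allclose(rows, U)
```

This is why the code can take a single Cholesky decomposition of `H_inv` up front and never touch it again inside the quantization loop.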

MilesQLi commented 8 months ago

Step 3 in the paper is not specific: there is no formula showing how the Cholesky decomposition is used to update (5) and (4), which makes this step very hard to understand. Is there a clearer description of this part?

MilesQLi commented 8 months ago

I did the derivation myself.
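For anyone else landing here, a sketch of the identity (my own notation, mapped onto the paper's Eqs. (4)-(5)): let $M^{(1)} = H^{-1}$ and let the update after each column remove the leading row and column,

```latex
M^{(q+1)} = \left( M^{(q)} - \frac{M^{(q)}_{:,1}\, M^{(q)}_{1,:}}{M^{(q)}_{11}} \right)_{2:,\,2:}.
```

This recursion is exactly the outer-product form of the Cholesky factorization of $H^{-1}$: writing $H^{-1} = U^\top U$ with $U$ upper triangular,

```latex
U_{q,\,q:} = \frac{M^{(q)}_{1,:}}{\sqrt{M^{(q)}_{11}}},
```

so every row that Eqs. (4)-(5) would need from the sequentially updated inverses can be read off $U$, computed once before the loop.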