baiSongL opened 10 months ago

I would like to ask about line 61 in your gptq.py file:

```python
inp = math.sqrt(2 / self.nsamples) * inp.float()
```

According to the paper, it seems that it should instead be:

```python
inp = math.sqrt(tmp / self.nsamples) * inp.float()
```

After making this modification, I noticed a reduction in quantization error. Could you please verify whether my understanding is correct, or whether there is some misunderstanding on my part?

---

Hi, this part of the code accumulates the average Hessian iteratively. Whether the 2 is there or not depends on the definition of the cost function (1/2 * squared error versus plain squared error), and likewise whether an average is taken or not. Neither of these has any effect on the resulting quantized weights (constant factors cancel out during the algorithm); it only changes the displayed per-layer error value.
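
For context, here is a minimal sketch of the kind of running-average accumulation the reply describes, assuming 2-D activations of shape (batch, features). The names `H`, `nsamples`, and `add_batch` mirror gptq.py, but the class itself is illustrative, not the repository's implementation:

```python
import math
import torch

class HessianAccumulator:
    # Illustrative sketch: maintains the invariant
    # H == (2 / nsamples) * sum_i x_i x_i^T across batches.
    def __init__(self, columns):
        self.H = torch.zeros((columns, columns))
        self.nsamples = 0

    def add_batch(self, inp):
        # inp: (batch, features); tmp is the number of new rows.
        tmp = inp.shape[0]
        inp = inp.t().float()  # (features, batch)
        # Rescale the old average from 2/n_old down to 2/(n_old + tmp) ...
        self.H *= self.nsamples / (self.nsamples + tmp)
        self.nsamples += tmp
        # ... then add the new batch so the running average stays exact.
        # The 2 comes from taking the cost as plain squared error (whose
        # Hessian is 2 * x x^T); with 1/2 * squared error it would be absent.
        inp = math.sqrt(2 / self.nsamples) * inp
        self.H += inp.matmul(inp.t())
```

Rescaling the old accumulator before adding the new batch keeps the running average exact after every call, so the final `H` equals (2/n) Σᵢ xᵢxᵢᵀ regardless of batch sizes.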
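
And a small self-contained check of the cancellation claim: in a GPTQ-style column step, the quantization error is divided by a diagonal entry of H⁻¹ and then propagated with the corresponding row of H⁻¹, so scaling H by a constant c scales H⁻¹ by 1/c, the error by c, and leaves the weight update unchanged. The helper `column_update` below is a hypothetical simplification for illustration, not code from the repo:

```python
import torch

torch.manual_seed(0)
d = 8
X = torch.randn(64, d)
H = X.t() @ X / 64 + 0.01 * torch.eye(d)  # SPD toy Hessian (with damping)

def column_update(H, w, q):
    # Hypothetical simplification of one column step: quantization error
    # on column 0, propagated to the remaining weights via H^{-1}.
    Hinv = torch.linalg.inv(H)
    err = (w[0] - q) / Hinv[0, 0]
    return err * Hinv[0, :]

w = torch.randn(d)
q = torch.round(w[0])              # toy "quantized" value for column 0
u1 = column_update(H, w, q)
u2 = column_update(2.0 * H, w, q)  # constant factor, e.g. the 2 in question
print(torch.allclose(u1, u2, atol=1e-5))  # True: the constant cancels
```

The displayed per-layer error, by contrast, involves an uncancelled power of the constant, which is consistent with the reported error changing after the modification even though the quantized weights do not.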