HandH1998 / QQQ

QQQ is an innovative and hardware-optimized W4A8 quantization solution for LLMs.
https://arxiv.org/pdf/2406.09904

rotation+gptq data #20

Open Andy0422 opened 1 week ago

Andy0422 commented 1 week ago

Hi,

Can you share the rotation+gptq perplexity (ppl) data? Is it better than smoothquant+gptq? Many thanks!

HandH1998 commented 1 week ago

Refer to https://github.com/HandH1998/QQQ/issues/13#issuecomment-2319955934. In my experience, rotation+gptq is generally better than smooth+gptq for per-channel quantization. However, this does not hold for some models; see https://github.com/HandH1998/QQQ/issues/17.

Andy0422 commented 6 days ago

@HandH1998

Hi, thank you for your kind help. I have run into another problem, this time with the calibration data.

As the test results below show, the results with wikitext2 look fine, but the results with the pile calibration dataset do not match your original data. The pile data I used is from https://huggingface.co/datasets/mit-han-lab/pile-val-backup/tree/main. Could you share your pile dataset with me, or share your comments on this finding? Email: wangdawei_0422@163.com.


| Granularity | Method | Llama-2 | Wikitext2 | Pile | Paper data |
| -- | -- | -- | -- | -- | -- |
| per-channel | smooth+gptq | 7B | 5.98 | 6.14 | 5.95 |
| per-group | smooth+gptq |   | 5.71 | 5.78 | 5.71 |
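
To make the size of the mismatch explicit, the gap between each reproduced perplexity and the paper's number can be computed directly from the table. This is only a minimal sketch: it assumes the Wikitext2 and Pile columns name the calibration dataset used for each run, and the dictionary keys and the `gaps_vs_paper` helper are hypothetical names chosen for illustration, not part of the QQQ code.

```python
# Perplexities copied from the table above (method: smooth+gptq).
# Assumption: "wikitext2" and "pile" name the calibration set of each run;
# the per-group row leaves the model column blank in the original table.
results = {
    "per-channel": {"wikitext2": 5.98, "pile": 6.14, "paper": 5.95},
    "per-group": {"wikitext2": 5.71, "pile": 5.78, "paper": 5.71},
}


def gaps_vs_paper(row):
    """Return (reproduced - paper) perplexity for each calibration set."""
    return {
        name: round(ppl - row["paper"], 2)
        for name, ppl in row.items()
        if name != "paper"
    }


gaps = {granularity: gaps_vs_paper(row) for granularity, row in results.items()}

for granularity, gap in gaps.items():
    print(granularity, gap)
```

Under that reading, the wikitext2 runs land within 0.03 of the paper's numbers, while the pile runs are 0.19 (per-channel) and 0.07 (per-group) higher.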

HandH1998 commented 5 days ago

@Andy0422 In our paper we used pile for smoothing and wikitext2 for gptq. The current code has since been fixed to use the same dataset for both smoothing and gptq, so it is expected that you cannot reproduce the paper's results with the latest code. The mismatch is not related to the pile data.

Andy0422 commented 5 days ago

> @Andy0422 We used pile for smoothing and wikitext2 for gptq in our paper. But the current code has fixed this issue to use the same dataset for both smoothing and gptq. So it is normal that you cannot reproduce the results of our paper using the latest code. It is not relevant with the pile data.

@HandH1998 Okay, I see. So do you think our test results are correct? Thank you!

HandH1998 commented 5 days ago

@Andy0422 It is probably correct.