lsjlsj5846 opened 2 months ago
Hi, @lsjlsj5846 Have you successfully reproduced the results when taking GPTQ as the weight quantizer?
I also successfully got results similar to the paper for Llama2-7B, 13B, 70B, and Llama-3 8B when taking RTN as the weight quantizer.
However, the GPTQ results I obtained are even worse than RTN.
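For clarity, by RTN I mean plain round-to-nearest fake quantization with a symmetric per-channel scale and no calibration data. A minimal sketch of what that does to a weight matrix (shapes are just illustrative):

```python
import torch

def rtn_quantize(weight: torch.Tensor, n_bits: int = 4) -> torch.Tensor:
    """Round-to-nearest (RTN) fake quantization with a symmetric
    per-output-channel scale; returns the dequantized weight."""
    qmax = 2 ** (n_bits - 1) - 1                      # 7 for INT4
    scale = weight.abs().amax(dim=1, keepdim=True) / qmax
    scale = scale.clamp(min=1e-8)                     # guard against all-zero rows
    q = torch.clamp(torch.round(weight / scale), -qmax - 1, qmax)
    return q * scale

# Illustrative usage on a random weight matrix
w = torch.randn(4096, 4096)
print((w - rtn_quantize(w)).abs().mean())             # mean quantization error
```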
Hi, @ChenMnZ Yes, I got GPTQ results similar to the paper, except for Llama3-70B. Did you use W16A4KV4 rotation matrices?
@lsjlsj5846 I used the W4A4KV4 pretrained rotation matrices before (https://drive.google.com/drive/folders/1R2zix4qeXBjcmgnJN1rny93cguJ4rEE8?usp=sharing).
Thanks for the reminder, I will give it a try with the W16A4KV4 rotation matrix.
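For anyone following along, the rotation files matter because SpinQuant folds an orthogonal matrix R into the weights before quantizing them, relying on the identity Wx = (WR)(Rᵀx). A toy check of that invariance with a random orthogonal stand-in (the real R is learned, not random):

```python
import torch

def random_orthogonal(n: int) -> torch.Tensor:
    """Random orthogonal matrix from a QR decomposition, a stand-in for
    SpinQuant's learned rotation, just to illustrate the invariance."""
    q, _ = torch.linalg.qr(torch.randn(n, n, dtype=torch.float64))
    return q

d = 1024
W = torch.randn(2048, d, dtype=torch.float64)
x = torch.randn(d, dtype=torch.float64)
R = random_orthogonal(d)

# y = W x is unchanged if the weight is rotated and the activation is
# counter-rotated; quantization is then applied to W @ R instead of W.
y_ref = W @ x
y_rot = (W @ R) @ (R.T @ x)
print(torch.allclose(y_ref, y_rot))  # True up to float64 round-off
```

So loading a mismatched rotation checkpoint (e.g. W4A4KV4 matrices for a W16A4KV4 setup) quantizes a differently-rotated weight than the one the matrices were optimized for, which can plausibly explain a large accuracy gap.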
@lsjlsj5846 I met the same problem with RTN on Llama3-70B W4A4KV4.
Hi, @ChenMnZ I also got GPTQ results that were different from the paper.
./scripts/2_eval_ptq.sh meta-llama/Llama-2-7b-hf 4 4 4
I also used the W16A4KV4 rotation matrix that was given (Google Drive).
Here's what I reproduced.

| Task | Version | Metric | Value | | Stderr | In paper |
|---|---|---|---|---|---|---|
| arc_easy | 0 | acc | 0.6540 | ± | 0.0098 | 72.6 |
| | | acc_norm | 0.5198 | ± | 0.0103 | |
| arc_challenge | 0 | acc | 0.3703 | ± | 0.0141 | 47.5 |
| | | acc_norm | 0.3891 | ± | 0.0142 | |
There is a big difference. I think the good results on Wikitext are likely due to overfitting on Wikitext 🤔.
Have you encountered the same problem as me? I look forward to discussing it with you. Thank you.
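For anyone who wants to cross-check these numbers independently of `2_eval_ptq.sh`, a sketch using the lm-evaluation-harness Python API (this assumes a recent `lm_eval` version, and `path/to/quantized-model` is a placeholder for wherever the fake-quantized checkpoint was saved in HF format):

```python
# Sketch: re-run the downstream tasks with lm-evaluation-harness directly.
# results["results"] holds acc / acc_norm (and stderr) per task.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=path/to/quantized-model",
    tasks=["wikitext", "arc_easy", "arc_challenge"],
)
for task, metrics in results["results"].items():
    print(task, metrics)
```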
I also agree about the overfitting. Maybe SpinQuant is more like LoRA, which tries to fit downstream tasks.
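That would also be consistent with how GPTQ works: it minimizes each layer's output reconstruction error on calibration data, typically sequences sampled from WikiText2, so WikiText perplexity can look good while zero-shot accuracy suffers. A heavily simplified, unblocked sketch of the column-by-column error compensation (not the exact implementation in this repo):

```python
import torch

def gptq_quantize(W: torch.Tensor, H: torch.Tensor, n_bits: int = 4,
                  damp: float = 0.01) -> torch.Tensor:
    """Heavily simplified GPTQ: quantize weight columns one at a time and
    spread each column's quantization error over the remaining columns,
    using the Hessian H = X X^T built from *calibration* activations X."""
    W = W.clone()
    d = W.shape[1]
    qmax = 2 ** (n_bits - 1) - 1
    scale = (W.abs().amax(dim=1, keepdim=True) / qmax).clamp(min=1e-8)

    # Dampened inverse Hessian, upper Cholesky factor (as in the GPTQ paper)
    H = H + damp * H.diagonal().mean() * torch.eye(d, dtype=H.dtype)
    Hinv = torch.cholesky_inverse(torch.linalg.cholesky(H))
    U = torch.linalg.cholesky(Hinv, upper=True)

    for i in range(d):
        w = W[:, i]
        q = torch.clamp(torch.round(w / scale[:, 0]), -qmax - 1, qmax) * scale[:, 0]
        W[:, i] = q
        if i + 1 < d:
            err = (w - q) / U[i, i]
            W[:, i + 1:] -= err.unsqueeze(1) * U[i, i + 1:].unsqueeze(0)
    return W
```

In a real run, H comes from a few hundred calibration sequences, which is exactly where a bias toward the calibration corpus could creep in.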
Hello,
I tried to reproduce the results of the paper and got similar results for Llama2-7B, 13B, 70B, and Llama-3 8B. However, when I tested Llama3-70B using the optimized rotation matrix you provided [link], the RTN result was as follows:
I also found that the GPTQ results for Llama3-70B differ from what you reported. (I used the W4A4KV4 rotation matrix for RTN and the W16A4KV4 rotation matrix for GPTQ.) I guess the provided rotation matrices for Llama3-70B are somehow wrong. Could you check this issue and provide the correct rotation matrix for Llama3-70B if possible?
Thank you.