facebookresearch / SpinQuant

Code repo for the paper "SpinQuant: LLM quantization with learned rotations"

Question about the optimized rotation matrix for Llama3-70B #11

Open lsjlsj5846 opened 2 months ago

lsjlsj5846 commented 2 months ago

Hello,

I tried to reproduce the results of the paper, and got similar results for Llama2-7B, 13B, 70B, and Llama-3 8B. However, when I tested Llama3-70B using the optimized rotation matrix you provided [link], the result of RTN was as follows:

| Wikitext-2 PPL | paper-reported | Mine | diff. |
|---|---|---|---|
| Llama3-70B | 4.1 | 7.5821 | 3.4821 |

I also found out that the GPTQ results for Llama3-70B differ from what you reported. (I used the W4A4KV4 rotation matrix for RTN, and the W16A4KV4 rotation matrix for GPTQ.) I suspect the provided rotation matrices for Llama3-70B are somehow wrong. Could you check this issue and provide the right rotation matrices for Llama3-70B if possible?
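As a quick sanity check before running a full eval, a downloaded matrix can at least be tested for orthogonality, since SpinQuant's learned rotations are constrained to be orthogonal (R^T R = I). A minimal sketch with numpy (the real checkpoints are torch tensors; the check itself is generic, and the random QR matrix below just stands in for a loaded checkpoint):

```python
import numpy as np

def orthogonality_error(R: np.ndarray) -> float:
    """Frobenius norm of R^T R - I; near zero for a valid rotation."""
    return float(np.linalg.norm(R.T @ R - np.eye(R.shape[0])))

# Demo on a random orthogonal matrix built via QR decomposition,
# standing in for a rotation loaded from a checkpoint.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((128, 128)))
good = orthogonality_error(Q)        # effectively zero
bad = orthogonality_error(Q * 1.1)   # a scaled matrix is not a rotation
```

A matrix that fails this check is definitely corrupted; passing it, of course, does not prove the file matches the intended model, which may be what is happening here.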

Thank you.

ChenMnZ commented 2 months ago

Hi, @lsjlsj5846 Have you successfully reproduced the results when taking GPTQ as the weight quantizer?

I also get results similar to the paper for Llama2-7B, 13B, 70B, and Llama-3 8B when taking RTN as the weight quantizer.

However, the GPTQ results I obtained were even worse than the RTN ones.
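For reference, RTN here is plain round-to-nearest weight quantization with no calibration, so it isolates rotation quality from GPTQ's error compensation. A minimal per-output-channel W4 sketch (asymmetric scales; SpinQuant's actual quantizer is more involved, e.g. it also tunes clipping ratios):

```python
import numpy as np

def rtn_quantize(w: np.ndarray, bits: int = 4) -> np.ndarray:
    """Round-to-nearest fake quantization, per output channel (row).

    Returns the dequantized weights, i.e. w mapped onto a
    (2**bits)-level uniform grid spanning each row's [min, max].
    """
    qmax = 2 ** bits - 1
    wmin = w.min(axis=1, keepdims=True)
    wmax = w.max(axis=1, keepdims=True)
    scale = (wmax - wmin) / qmax
    scale[scale == 0] = 1.0  # guard against all-constant rows
    zero = np.round(-wmin / scale)
    q = np.clip(np.round(w / scale) + zero, 0, qmax)
    return (q - zero) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 64)).astype(np.float64)
w_q = rtn_quantize(w, bits=4)
err = np.abs(w - w_q).max()  # bounded by half a quantization step
```

With only 16 levels per row, the reconstruction error is entirely determined by each row's dynamic range, which is why rotations that flatten outliers help RTN so much.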

lsjlsj5846 commented 2 months ago

Hi, @ChenMnZ Yes, I got GPTQ results similar to the paper, except for Llama3-70B. Did you use W16A4KV4 rotation matrices?

ChenMnZ commented 2 months ago

@lsjlsj5846 I used the W4A4KV4 pretrained rotation matrices before (https://drive.google.com/drive/folders/1R2zix4qeXBjcmgnJN1rny93cguJ4rEE8?usp=sharing).

Thanks for the reminder, I will give the W16A4KV4 rotation matrix a try.

ChenMnZ commented 2 months ago

@lsjlsj5846 I meet the same problem with RTN on Llama3-70B W4A4KV4.

cokeshao commented 2 months ago

Hi, @ChenMnZ I also got GPTQ results that were different from the paper.

`./scripts/2_eval_ptq.sh meta-llama/Llama-2-7b-hf 4 4 4`

I also used the W16A4KV4 rotation matrix that was provided (google drive).

Here's what I reproduced.

| Task | Version | Metric | Value | Stderr | In paper |
|---|---|---|---|---|---|
| arc_easy | 0 | acc | 0.6540 | ± 0.0098 | 72.6 |
| | | acc_norm | 0.5198 | ± 0.0103 | |
| arc_challenge | 0 | acc | 0.3703 | ± 0.0141 | 47.5 |
| | | acc_norm | 0.3891 | ± 0.0142 | |

There is a big difference. I suspect the good results on Wikitext are likely due to overfitting on Wikitext 🤔.

Have you encountered the same problem as me? I look forward to discussing it with you. Thank you.

JingyangXiang commented 6 days ago

> Hi, @ChenMnZ I also got GPTQ results that were different from the paper.
>
> `./scripts/2_eval_ptq.sh meta-llama/Llama-2-7b-hf 4 4 4`
>
> I also used the W16A4KV4 rotation matrix that was provided (google drive).
>
> Here's what I reproduced.
>
> | Task | Version | Metric | Value | Stderr | In paper |
> |---|---|---|---|---|---|
> | arc_easy | 0 | acc | 0.6540 | ± 0.0098 | 72.6 |
> | | | acc_norm | 0.5198 | ± 0.0103 | |
> | arc_challenge | 0 | acc | 0.3703 | ± 0.0141 | 47.5 |
> | | | acc_norm | 0.3891 | ± 0.0142 | |
>
> There is a big difference. I suspect the good results on Wikitext are likely due to overfitting on Wikitext 🤔.
>
> Have you encountered the same problem as me? I look forward to discussing it with you. Thank you.

I also agree with this overfitting hypothesis. Maybe SpinQuant is more like LoRA, which tries to fit the downstream task.