Vahe1994 / SpQR

Apache License 2.0

Doesn't seem to work for Baichuan-7B #33

Closed CPegasus closed 11 months ago

CPegasus commented 11 months ago

Hi, @Vahe1994. Thank you for releasing such great work! I applied SpQR to Baichuan-7B (whose network structure is the same as LLaMA-7B, except that the embedding and lm_head layers cover twice as many tokens) and found that outlier_threshold had to be set quite high (3.0) to reach the outlier fraction of roughly 1% recommended in your paper. However, with that setting, the average zero-shot score on the C-Eval val set dropped drastically, from 38.5 to 23.0 (evaluated with the official script). I would greatly appreciate it if you could help us understand this issue. Thank you so much!
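
To make the relationship between a threshold and the resulting outlier fraction concrete, here is a minimal sketch. It is not SpQR's actual outlier criterion: `scores` is a hypothetical list of per-weight sensitivity values, and both helper functions are made up for illustration. It only shows how a threshold can be read off from a target fraction such as 1%.

```python
import random


def outlier_fraction(scores, threshold):
    """Return the fraction of scores strictly above the threshold."""
    return sum(s > threshold for s in scores) / len(scores)


def threshold_for_fraction(scores, target_fraction):
    """Return a threshold for which at most target_fraction of scores are outliers.

    Hypothetical helper: picks the (k+1)-th largest score, so that at most
    k = floor(target_fraction * n) scores lie strictly above it.
    """
    ordered = sorted(scores, reverse=True)
    k = min(int(target_fraction * len(ordered)), len(ordered) - 1)
    return ordered[k]


if __name__ == "__main__":
    random.seed(0)
    # Synthetic stand-in for per-weight sensitivity scores (assumption).
    scores = [random.gauss(0.0, 1.0) ** 2 for _ in range(10_000)]
    t = threshold_for_fraction(scores, 0.01)
    print(f"threshold={t:.3f}, fraction={outlier_fraction(scores, t):.4f}")
```

If the score distribution of a model has a heavier tail than the one the default threshold was tuned on, the same threshold marks more weights as outliers, so a higher value is needed to reach the same fraction.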