locuslab / massive-activations

Code accompanying the paper "Massive Activations in Large Language Models"
https://arxiv.org/abs/2402.17762
MIT License

How to get the mean value of massive activation #3

Open pengyao96 opened 6 months ago

pengyao96 commented 6 months ago
  1. How do you get the mean value of a massive activation, e.g. 2546.8/-1502.0 in hook.py?
  2. The mean value is still large. What is the difference between using the mean value and using the original value?
Eric-mingjie commented 6 months ago

Hi, thanks for your interest in our work.

To get the mean value, we simply evaluate 100 sequences from RedPajama and record the value of massive activations of each sequence.
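The averaging step described above could be sketched roughly as follows. This is only an illustration on toy data, not the code in hook.py: the function names, the `(token_idx, feature_dim)` location, and the nested-list layout of the hidden states are all assumptions made for the example. In practice the hidden states would come from running the model on the 100 RedPajama sequences.

```python
from statistics import fmean

def massive_activation_value(hidden, token_idx, feature_dim):
    """Read one activation from a (seq_len x hidden_dim) nested list."""
    return hidden[token_idx][feature_dim]

def mean_massive_activation(hidden_states, token_idx, feature_dim):
    """Record the activation at one location for every sequence, then average."""
    values = [
        massive_activation_value(h, token_idx, feature_dim)
        for h in hidden_states
    ]
    return fmean(values), values

# Toy stand-in for per-sequence hidden states (3 sequences, 2 tokens, 3 features).
# The numbers are made up; a real run would use model outputs.
toy_hidden_states = [
    [[0.1, 2500.0, -0.3], [0.2, 1.0, 0.4]],
    [[0.0, 2600.0, 0.5], [0.3, -1.0, 0.1]],
    [[-0.2, 2540.0, 0.2], [0.1, 0.5, -0.4]],
]

# Hypothetical location of the massive activation: token 0, feature 1.
mean_value, per_sequence = mean_massive_activation(
    toy_hidden_states, token_idx=0, feature_dim=1
)
print(per_sequence, mean_value)
```

Keeping the per-sequence values alongside the mean makes it easy to inspect how much the activation varies across inputs.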

In practice, we find no performance difference between using the mean value and using the original value. However, the original value varies from sequence to sequence (see Table 2), so it would be hard to justify which original value to use.
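To make the distinction concrete, here is a small sketch of what "using the mean value" could look like: the activation at one location is overwritten with a single fixed number for every sequence, instead of keeping each sequence's own value. The helper name, the location, and the toy numbers are hypothetical and are not taken from the repository.

```python
def set_activation(hidden, token_idx, feature_dim, value):
    """Return a copy of a (seq_len x hidden_dim) nested list with one entry replaced."""
    patched = [row[:] for row in hidden]  # copy rows so the input is not mutated
    patched[token_idx][feature_dim] = value
    return patched

# Two toy sequences whose "original" values at (token 0, feature 1) differ.
seq_a = [[0.1, 2500.0], [0.2, 1.0]]
seq_b = [[0.0, 2600.0], [0.3, -1.0]]

# One fixed value shared by all sequences (here, the mean of the two originals).
fixed_value = (seq_a[0][1] + seq_b[0][1]) / 2

patched_a = set_activation(seq_a, token_idx=0, feature_dim=1, value=fixed_value)
patched_b = set_activation(seq_b, token_idx=0, feature_dim=1, value=fixed_value)
print(patched_a[0][1], patched_b[0][1])
```

With a fixed value, every sequence sees the same number at that location, which avoids having to pick one sequence's original value over another's.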