Open Xingrun-Xing opened 4 months ago
I follow the implementation in the SparseGPT code: https://github.com/IST-DASLab/sparsegpt/blob/c3bbf613a1822229767f4d8870b933049b8bef15/sparsegpt.py#L96C21-L96C78
Thanks for your reply. But from OBS/OBC/SparseGPT, we know that delta_loss = w^2 / H_ii, not w^2 / (H_ii)^2. Do you know why w^2 / (H_ii)^2 should be used as the importance metric?
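A possible reading of the linked line, offered as a sketch rather than a statement from the SparseGPT authors: as far as I can tell, the `Hinv` used there is not the Hessian but the upper Cholesky factor U of the inverse Hessian (H^-1 = U^T U). If that is right, the squared diagonal entry U_qq^2 equals [(H_{q:,q:})^-1]_00, the inverse-Hessian diagonal for weight q once weights 0..q-1 have been eliminated. Then w^2 / (U_qq)^2 is the OBS saliency w_q^2 / [H^-1]_qq (up to the constant factor 1/2), not a squared version of it. The pure-Python check below verifies that identity on a small example; the helper names `inverse` and `cholesky_upper` and the 3x3 matrix `H` are made up for illustration.

```python
def inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination.

    No pivoting is done, which is acceptable for the symmetric
    positive-definite matrices used in this sketch.
    """
    n = len(a)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for c in range(n):
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def cholesky_upper(a):
    """Return the upper-triangular U such that a = U^T U."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = s ** 0.5 if i == j else s / low[j][j]
    # Transpose the lower factor to obtain the upper factor.
    return [[low[j][i] for j in range(n)] for i in range(n)]


# Illustrative symmetric positive-definite "Hessian" (not taken from any model).
H = [[4.0, 2.0, 1.0],
     [2.0, 5.0, 3.0],
     [1.0, 3.0, 6.0]]

# Upper Cholesky factor of the inverse Hessian, H^-1 = U^T U.
U = cholesky_upper(inverse(H))

for q in range(len(H)):
    # Hessian restricted to the weights that remain after removing 0..q-1.
    trailing = [row[q:] for row in H[q:]]
    obs_denominator = inverse(trailing)[0][0]
    print(q, U[q][q] ** 2, obs_denominator)
```

Each printed row should show the same value twice (up to floating-point error), which is why dividing by the squared diagonal of the Cholesky factor is consistent with the w^2 / [H^-1]_qq form from OBS.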