facebookresearch / SpinQuant

Code repo for the paper "SpinQuant LLM quantization with learned rotations"

A question regarding the rotation matching pairs #7

Open Menace-Dragon opened 3 months ago

Menace-Dragon commented 3 months ago

SpinQuant is a follow-up to QuaRot, but we noticed that the two papers pair the rotation matrices differently. In QuaRot, there is first an online Hadamard operation (with a dimension of head_dim) before o_proj. Second, o_weight is fused with a Hadamard matrix (H) of the entire tensor dimension, as highlighted in the red box in the figure below.

image

In SpinQuant, the online Hadamard operation before o_proj is removed. In addition, o_weight is fused with a matrix of dimension head_dim only (the H above, which SpinQuant denotes R2).
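To make the head-wise pairing concrete, here is a minimal NumPy sketch, not code from either repo. It assumes the head-wise rotation is fused into the output side of v_proj and the input side of o_weight, and it uses a random orthogonal matrix as a stand-in for R2; the names `R_block`, `attn_out`, and the toy shapes are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_heads, head_dim, d_model, n_tokens = 4, 8, 32, 5

def random_orthogonal(n, rng):
    # QR of a Gaussian matrix yields an orthogonal Q (stand-in for a rotation).
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q

# Hypothetical head-wise rotation; the same matrix is shared by every head.
R2 = random_orthogonal(head_dim, rng)
# Block-diagonal form acting on the concatenated heads.
R_block = np.kron(np.eye(n_heads), R2)

x = rng.standard_normal((n_tokens, d_model))
W_v = rng.standard_normal((d_model, n_heads * head_dim))  # toy v_proj weight
W_o = rng.standard_normal((n_heads * head_dim, d_model))  # toy o_proj weight

# Toy attention probabilities per head (each row sums to 1).
A = rng.random((n_heads, n_tokens, n_tokens))
A /= A.sum(axis=-1, keepdims=True)

def attn_out(W_v, W_o):
    # Split values into heads: (heads, tokens, head_dim).
    v = (x @ W_v).reshape(n_tokens, n_heads, head_dim).transpose(1, 0, 2)
    ctx = A @ v  # attention mixes tokens within each head
    ctx = ctx.transpose(1, 0, 2).reshape(n_tokens, n_heads * head_dim)
    return ctx @ W_o

ref = attn_out(W_v, W_o)
# Fuse R2 into v_proj (output side) and its transpose into o_weight (input side).
rot = attn_out(W_v @ R_block, R_block.T @ W_o)
assert np.allclose(ref, rot)
```

Because attention only mixes tokens inside a head, the per-head rotation passes through it untouched and cancels against the transpose fused into o_weight, so in full precision no online operation is needed for this pairing.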

image

Why is this the case? Are the rotation matrix pairings in the two papers equivalent?

aiwhz commented 1 week ago

Hi @Menace-Dragon,

They are different: 'this transformation R2 has now been applied head-wise to the weight matrices, and results in computed activations (emitted by the block multi-head attention) rotated head-wise.'

QuaRot then applies a further rotation "To complete a “full” Hadamard operation on the attention-activations".
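A rough NumPy sketch of how I read that step, under stated assumptions: Sylvester Hadamard matrices with power-of-two sizes, a row-vector convention, and made-up names (`headwise`, `across`, `ctx`). SpinQuant's R2 is learned rather than fixed, so a Hadamard matrix is used for both variants here purely to make them comparable.

```python
import numpy as np

def hadamard(n):
    """Normalized Sylvester Hadamard matrix; n must be a power of two."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)

n_heads, head_dim, d_model = 4, 8, 16
H_full = hadamard(n_heads * head_dim)

# Head-wise part: the same head_dim rotation inside every head.
headwise = np.kron(np.eye(n_heads), hadamard(head_dim))
# Across-head part: mixes heads and leaves positions within a head alone.
across = np.kron(hadamard(n_heads), np.eye(head_dim))

# Kronecker identity: head-wise followed by across-head equals the full Hadamard.
assert np.allclose(headwise @ across, H_full)

rng = np.random.default_rng(0)
ctx = rng.standard_normal((5, n_heads * head_dim))   # toy attention output
W_o = rng.standard_normal((n_heads * head_dim, d_model))

# QuaRot-style pairing: full rotation on the activation, full H fused into o_weight.
quarot_in = ctx @ headwise @ across
quarot_out = quarot_in @ (H_full.T @ W_o)

# SpinQuant-style pairing: head-wise rotation only, matched head-wise in o_weight.
spin_in = ctx @ headwise
spin_out = spin_in @ (headwise.T @ W_o)

# Same o_proj output in full precision, but o_proj sees different activations.
assert np.allclose(quarot_out, ctx @ W_o)
assert np.allclose(spin_out, ctx @ W_o)
assert not np.allclose(quarot_in, spin_in)
```

So in this sketch the two pairings give the same output before quantization, while the tensor that actually gets quantized at the o_proj input is rotated differently in each case.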

image

Thanks.