atomicarchitects / equiformer_v2

[ICLR 2024] EquiformerV2: Improved Equivariant Transformer for Scaling to Higher-Degree Representations
https://arxiv.org/abs/2306.12059
MIT License

Question about the edge_rot_mat #17

Open HeegerGao opened 2 months ago

HeegerGao commented 2 months ago

Hi, thanks for your wonderful work and code! I am new to the equivariant learning area, and I am trying to understand each step of your code. I suspect this function (the function for calculating the edge rotation matrix) would break the equivariance during the edge-degree embedding layer.

In this function, you set the original edge_distance_vec as the final x_axis, and randomly select the y-axis and generate the z_axis according to the other two axes. I think this random operation is not correct.
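To make the construction described above concrete, here is a minimal pure-Python sketch. It is not the repository's function: the names `edge_frame`, `normalize`, `cross` and `matvec` are hypothetical, and the axis convention (edge along x, as in the description above) is an assumption. It builds two frames for the same edge with different random helper vectors and checks where each frame sends the edge.

```python
import math
import random

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def edge_frame(edge_vec, rng):
    """Hypothetical helper: rows are the x-, y-, z-axes of a right-handed
    orthonormal frame whose x-axis points along edge_vec."""
    x_axis = normalize(edge_vec)
    while True:
        # Random helper vector; redraw if it is (almost) parallel to the edge.
        helper = normalize([rng.gauss(0.0, 1.0) for _ in range(3)])
        z_raw = cross(x_axis, helper)
        if math.sqrt(sum(c * c for c in z_raw)) > 1e-3:
            break
    z_axis = normalize(z_raw)
    y_axis = cross(z_axis, x_axis)  # completes the right-handed frame
    return [x_axis, y_axis, z_axis]

rng = random.Random(0)
edge = [1.0, 2.0, -0.5]

frame_a = edge_frame(edge, rng)
frame_b = edge_frame(edge, rng)

# Both frames send the edge to (|edge|, 0, 0), even though their
# y- and z-axes differ by a rotation about the edge direction.
aligned_a = matvec(frame_a, edge)
aligned_b = matvec(frame_b, edge)
print(aligned_a, aligned_b)
```

In this sketch the random choice only changes the frame by a rotation about the edge itself, so the open question is whether the layers applied in that frame are sensitive to such a rotation.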

For example, suppose I feed two identical molecules into the network, where the coordinates of one are translated relative to the other but not rotated. If the network is SE(3)-equivariant, the final features should be the same for the two molecules. However, in the above function they will get different edge rotation matrices on the same edge because the y_axis is selected randomly, so the corresponding Wigner-D matrices will differ. Since the edge-degree embedding layer uses the Wigner-D matrix, the embeddings of the two molecules will be different. This breaks the equivariance for all features with type>0.

I am not sure if I am correct. Looking forward to your opinion on this issue.

HeegerGao commented 2 months ago

Hi, I have tried a toy example and found that the edge-degree embedding is OK, but the final results are not SO(3)-equivariant. It seems some small errors accumulate after each TransBlockV2 (this step), and the cause again seems to be the SO(3)_Rotation class, i.e., the Wigner-D matrix.

yilunliao commented 2 months ago

Hi @HeegerGao

The rotation (and inverse rotation after SO(2) linear layers) make everything SO(3)-equivariant. You can check the papers of eSCN and EquiformerV2 for more details.
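A toy illustration of this argument for degree-1 (vector) features, as a sketch only: rotate into a randomly completed edge frame, apply a map that commutes with rotations about the edge axis, then rotate back. Everything here is an assumption made for illustration (the helper names, the 3x3 matrix `SO2_MAP` standing in for an SO(2) linear layer, and the edge-along-x convention); it is not the code from the repository.

```python
import math
import random

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def edge_frame(edge_vec, rng):
    """Hypothetical helper: proper rotation whose first row is the edge direction."""
    x_axis = normalize(edge_vec)
    while True:
        helper = normalize([rng.gauss(0.0, 1.0) for _ in range(3)])
        z_raw = cross(x_axis, helper)
        if math.sqrt(sum(c * c for c in z_raw)) > 1e-3:
            break
    z_axis = normalize(z_raw)
    return [x_axis, cross(z_axis, x_axis), z_axis]

# Toy stand-in for an SO(2) linear layer on l=1 features: it scales the
# component along the edge and applies (B*I + C*J) to the two transverse
# components, so it commutes with every rotation about the x-axis.
A, B, C = 0.7, 1.3, -0.4
SO2_MAP = [[A, 0.0, 0.0],
           [0.0, B, -C],
           [0.0, C, B]]

def message(edge_vec, feature, rng):
    frame = edge_frame(edge_vec, rng)       # random completion of the frame
    aligned = matvec(frame, feature)        # rotate into the edge frame
    mixed = matvec(SO2_MAP, aligned)        # SO(2)-equivariant mixing
    return matvec(transpose(frame), mixed)  # rotate back

def max_abs_diff(u, v):
    return max(abs(p - q) for p, q in zip(u, v))

rng = random.Random(0)
edge = [1.0, 2.0, -0.5]
feature = [0.3, -1.1, 0.8]

# Same input, two different random frames: the outputs agree.
gauge_error = max_abs_diff(message(edge, feature, rng),
                           message(edge, feature, rng))

# Rotating the inputs by q rotates the output by q.
q = edge_frame([0.2, -0.7, 1.5], rng)  # reuse the helper to get some rotation
lhs = message(matvec(q, edge), matvec(q, feature), rng)
rhs = matvec(q, message(edge, feature, rng))
equivariance_error = max_abs_diff(lhs, rhs)

print(gauge_error, equivariance_error)
```

Both errors should sit at floating-point round-off level in this toy setting, because the leftover freedom in the frame is a rotation about the edge, and that rotation cancels against its inverse once the map in the middle commutes with it.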

There are many other things in the TransBlockV2. For example, the S2 activation would introduce some small equivariance errors.

Since you mentioned the edge-degree embedding, containing the rotation, is equivariant, I think the equivariance errors do not come from the rotation (or Wigner-D matrices).

Can you please check if there is something that contributes to the small equivariance errors you observed (for example, S2 activation)?
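One way to see how a grid-based activation can produce small equivariance errors is a one-dimensional circle analogue, sketched below under loud assumptions: it uses Fourier coefficients on a circle instead of spherical harmonics on S2, a SiLU nonlinearity chosen for illustration, and hypothetical function names throughout. It is not the S2 activation from the repository. The signal is sampled on a grid, passed through the pointwise nonlinearity, and projected back to the same maximum degree; the result is compared with rotating before versus after.

```python
import cmath
import math

L_MAX = 2  # highest frequency kept, playing the role of the maximum degree

def silu(x):
    return x / (1.0 + math.exp(-x))

def to_grid(coeffs, n):
    """Evaluate f(theta) = sum_m c_m exp(i m theta) on n equispaced points."""
    return [sum(c * cmath.exp(1j * m * 2.0 * math.pi * k / n)
                for m, c in coeffs.items()).real
            for k in range(n)]

def from_grid(values, n):
    """Project grid samples back onto frequencies -L_MAX..L_MAX."""
    return {m: sum(v * cmath.exp(-1j * m * 2.0 * math.pi * k / n)
                   for k, v in enumerate(values)) / n
            for m in range(-L_MAX, L_MAX + 1)}

def grid_activation(coeffs, n):
    return from_grid([silu(v) for v in to_grid(coeffs, n)], n)

def rotate(coeffs, phi):
    """Coefficients of the rotated signal f(theta - phi)."""
    return {m: c * cmath.exp(-1j * m * phi) for m, c in coeffs.items()}

def equivariance_error(coeffs, n, phi):
    rotate_then_act = grid_activation(rotate(coeffs, phi), n)
    act_then_rotate = rotate(grid_activation(coeffs, n), phi)
    return max(abs(rotate_then_act[m] - act_then_rotate[m])
               for m in rotate_then_act)

# A real-valued test signal: c_{-m} is the complex conjugate of c_m.
coeffs = {0: 0.2 + 0.0j}
for m, c in {1: 1.0 + 0.5j, 2: -0.6 + 0.3j}.items():
    coeffs[m] = c
    coeffs[-m] = c.conjugate()

phi = 0.3  # not a multiple of either grid spacing
error_coarse = equivariance_error(coeffs, n=8, phi=phi)
error_fine = equivariance_error(coeffs, n=64, phi=phi)
print(error_coarse, error_fine)
```

In this analogue the nonlinearity creates frequencies above `L_MAX`, and on a coarse grid they alias back into the kept coefficients with a phase that does not rotate correctly, so the error shrinks as the grid is refined. A similar before/after-rotation comparison applied to each sub-module of the block, one at a time, is one way to locate which step contributes to the error.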