pengxingang / Pocket2Mol

Pocket2Mol: Efficient Molecular Sampling Based on 3D Protein Pockets

Non E(3) equivariant model issue #23

Closed: yangxiufengsia closed this issue 1 year ago

yangxiufengsia commented 1 year ago

Hi, congratulations on the nice paper. Regarding your E(3) model, I think it might not be translationally equivariant. I have similar questions to @oriondollar's, so I thought it would be better to open a new issue here.

(1) In your model, normalized absolute coordinates are used directly as the input vector features, so I am wondering whether the model works for other test data with different absolute coordinates. In your reply to @oriondollar you mentioned that we can translate the pocket's center of mass to the origin of the coordinate system. Could this resolve the lack of translational equivariance completely?

(2) I noticed that you use a normalization factor of 20 for the input absolute coordinate vectors. Is there a particular reason for choosing 20?

(3) Similar to question (1): for the vertex vectors, you also use the normalized absolute coordinates directly. Does this guarantee equivariance? If we feed in different normalized absolute coordinates as the vertex vectors, will the model still work?

(4) You also mentioned that the embedding layer has only a minor effect on translational equivariance and that the model still learns it, since the subtraction of two vectors is unchanged by translation. But in my tests this does not hold when I use protein-ligand data whose coordinates lie outside the coordinate range of the training data.

Did I misunderstand something? Do you have any ideas for solving this issue? Looking forward to your reply, and thank you in advance.
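(For reference, a minimal sketch of the kind of translation check behind question (4); `model` and its call signature are hypothetical placeholders, not the actual Pocket2Mol API.)

```python
import torch
import torch.nn as nn

def check_translation_equivariance(model: nn.Module,
                                   pos: torch.Tensor,
                                   features: torch.Tensor,
                                   shift: float = 100.0) -> bool:
    """Translate the input coordinates by a constant vector and test whether
    the predicted positions shift by exactly the same vector."""
    t = torch.full((1, 3), shift)           # constant translation applied to every atom
    out = model(pos, features)              # predicted atom positions, original frame
    out_shifted = model(pos + t, features)  # same input after translation
    # A translation-equivariant model satisfies model(pos + t) == model(pos) + t.
    return torch.allclose(out_shifted, out + t, atol=1e-4)
```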

pengxingang commented 1 year ago

Hi, thanks for your insightful questions. Here are my answers.

  1. Theoretically, translating the pocket to the origin ensures the translational equivariance of the model. This strategy is also used in related works such as GeoDiff and TargetDiff, and I think it is a simple yet effective way to make an O(3)-equivariant model translationally equivariant.
  2. The value 20 was chosen so that the normalized vector features are on a scale comparable to the scalar features.
  3. As I explained in the previous issue, using the coordinates as the initial vector features is not inherently equivariant, but the model can learn that the subtraction of two vector features is equivariant (for any translation t, (x_i + t) - (x_j + t) = x_i - x_j). In our evaluation, the pockets in the test set do not share coordinates with those in the training set and the model still worked, so we think it can sample for pockets with different absolute coordinates.
  4. Most samples are located around the origin, so the original model, with its weak translational equivariance, works for them. If the samples are moved far away, however, strict equivariance is required. To address this we can modify the initial vector features, for example by using all-zero initial vectors or relative positions, as other related works do.
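(A minimal sketch of points 1 and 2, not the repository's actual preprocessing code: shift the pocket and ligand by the pocket center, here approximated by the mean atom position, and divide by the normalization factor.)

```python
import torch

NORM_FACTOR = 20.0  # normalization factor discussed above

def center_and_normalize(pocket_pos: torch.Tensor, ligand_pos: torch.Tensor):
    """Shift pocket and ligand coordinates by the pocket center, then scale.
    Returns the transformed coordinates and the offset needed to map
    generated atoms back to the original frame."""
    center = pocket_pos.mean(dim=0, keepdim=True)        # (1, 3) pocket center
    pocket_out = (pocket_pos - center) / NORM_FACTOR
    ligand_out = (ligand_pos - center) / NORM_FACTOR
    return pocket_out, ligand_out, center

# Generated coordinates can be mapped back to the original frame with
#   original_pos = generated_pos * NORM_FACTOR + center
```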

yangxiufengsia commented 1 year ago

Hi, thank you very much for the detailed and valuable answers.
(1) Thanks for sharing the references; I will check them. (2) I see. So for inputs with different absolute coordinate scales, the normalization factor should be changed accordingly. (3) I see.

(4) I still have some questions about the ideas you provided:

  1. Using all-zero initial vectors instead of the absolute coordinates is interesting. However, I worry that zero vectors might destroy the global geometric features the model could otherwise learn, since each node's global position is replaced by zero and the model might not be able to learn the correct directions for the next atoms. In my experiments the model does work under different translations, but the performance degrades and it struggles to close rings. What do you think?
  2. You also suggested using relative positions as the vertex vector features, but I am still confused about how to define good relative positions, since each node can have several neighbours. Do you have any good ideas for that?
  3. By the way, do you have any plans to further improve the current implementation into a strictly E(3)-equivariant model in the future?

Again, thank you so much.

pengxingang commented 1 year ago
  1. I did not mean using all zeros as the initial absolute coordinates; the atom coordinates themselves are unchanged. I only meant that the vector features are initialized as all zeros.
  2. A simple strategy is to compute the initial vector features of a vertex from the summation (or another pooling) of the relative vectors pointing from the other vertices to that vertex; the following updates are the same as before. In this way the initial vector features are equivariant (both options are illustrated in the sketch after this list).
  3. You can check the PaiNN paper for more information about using vector features; in that paper they use all-zero vector initialization. I think improving Pocket2Mol into a strictly E(3)-equivariant model, using the strategies discussed above or other equivariant architectures, would be a good idea for better generalization, but we currently have no plan to do that.
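(A minimal sketch of the two initializations discussed above, assuming per-atom vector features of shape `(num_atoms, num_channels, 3)` and a graph given by `edge_index`; this is not the repository's actual code, and the relative-position variant pools over graph neighbours rather than literally all other vertices.)

```python
import torch

def init_vectors_zero(num_atoms: int, num_channels: int) -> torch.Tensor:
    # All-zero initial vector features (as in PaiNN): trivially equivariant;
    # directional information then enters only through the relative positions
    # used inside the message-passing layers.
    return torch.zeros(num_atoms, num_channels, 3)

def init_vectors_relative(pos: torch.Tensor,
                          edge_index: torch.Tensor,
                          num_channels: int) -> torch.Tensor:
    # Equivariant initialization from relative positions: for each vertex,
    # sum the vectors pointing from its neighbours to it. Every term is a
    # coordinate difference, so the result is translation-invariant and
    # rotates with the input.
    src, dst = edge_index                 # edges point src -> dst
    rel = pos[dst] - pos[src]             # (num_edges, 3) relative vectors
    agg = torch.zeros(pos.size(0), 3, dtype=pos.dtype).index_add_(0, dst, rel)
    # Broadcast the aggregated direction to all vector channels.
    return agg.unsqueeze(1).expand(-1, num_channels, -1).contiguous()
```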
yangxiufengsia commented 1 year ago

Thank you so much for your valuable suggestions, which inspire me a lot. I would like to try the methods you mentioned, and I look forward to an updated model in the future.