TUM-DAML / gemnet_pytorch

GemNet model in PyTorch, as proposed in "GemNet: Universal Directional Graph Neural Networks for Molecules" (NeurIPS 2021)
https://www.daml.in.tum.de/gemnet

Question about basis functions used #13

Closed jiali1025 closed 1 year ago

jiali1025 commented 1 year ago

Your series of studies are excellent! I really enjoy studying them. However, I have a question about this paper:

[Figure: the three basis-function equations from the paper]

For these three basis equations, my understanding is that you are using the spherical solution form of the Schrödinger equation, which splits the solution into a radial part and an angular part, as an inductive bias for the machine learning model. This ensures the input features carry certain invariances. However, I have two major questions.

  1. In my understanding, such a solution form requires a fixed origin, and hence a fixed axis system. However, as shown in the figure, the geometric quantities used in the three equations do not look like traditional quantum physics; they appear to be measured from different origins. For the DimeNet works I can identify an origin, so I can follow the construction, but for GemNet this seems difficult to understand.
  2. As you know, this invariant form of information will be transformed by learnable weight matrices. Will the symmetric form and the invariance still be maintained after the transformation? I suspect the transformation may destroy them; if so, is this kind of preprocessing still that significant?

Thank you so much for such inspiring studies! I am really curious about the above two questions.

Kindly Regards,

Jiali

jiali1025 commented 1 year ago

In addition, a small third question, just to double-check: I think there are no learnable parameters in the basis transformation, and the weights are made trainable only so that the forces can be computed (i.e., to maintain differentiability).

gasteigerjo commented 1 year ago

Hi Jiali!

Thank you for your kind words and your interest in our work!

There are models like Cormorant and NequIP that leverage deep connections between these functions and the rotation group to achieve equivariance. However, in GemNet and DimeNet these basis functions are just mathematical functions we use to transform the input information in a helpful way. They do not provide any kind of invariance. They are merely inspired by the derivation from the Schrödinger equation.
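For concreteness, the radial part of such a basis (as written in the DimeNet paper; this is an illustrative sketch, not the repository's implementation, and `radial_bessel_basis` is a made-up name) simply maps each distance to a fixed feature vector:

```python
import numpy as np

def radial_bessel_basis(d, num_radial=6, cutoff=5.0):
    """Spherical-Bessel-style radial basis from the DimeNet paper:
    sqrt(2/c) * sin(n * pi * d / c) / d  for n = 1..num_radial.
    A fixed transformation of the distance, with no trainable weights."""
    n = np.arange(1, num_radial + 1)
    d = d[:, None]  # (num_edges, 1) so it broadcasts against n
    return np.sqrt(2.0 / cutoff) * np.sin(n * np.pi * d / cutoff) / d

d = np.array([1.0, 2.5, 4.0])   # interatomic distances
basis = radial_bessel_basis(d)
print(basis.shape)  # (3, 6): one 6-dim feature vector per distance
```

The point is that this is just a featurization of a scalar input, not a source of invariance by itself.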

Instead, the model's invariance stems from the fact that we only use the system's internal coordinates: distances between atoms and angles between inter-atom directions (see Figure 1 in the paper). Internal coordinates are invariant to rotations, translations, etc. And since these inputs are invariant, the output will be invariant as well, regardless of the learned transformation function. I think this is quite intuitive: the model cannot use information (the coordinate system) that it has never seen.
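A quick numerical check of this argument (a toy sketch, not code from this repository; `internal_coords` is a made-up helper): the internal coordinates themselves do not change under rotation or translation, so any function of them, learned or not, is automatically invariant.

```python
import numpy as np

def internal_coords(pos):
    """Two distances and one angle from three atom positions."""
    d_ab = np.linalg.norm(pos[1] - pos[0])
    d_bc = np.linalg.norm(pos[2] - pos[1])
    u = (pos[0] - pos[1]) / d_ab            # direction b -> a
    v = (pos[2] - pos[1]) / d_bc            # direction b -> c
    angle = np.arccos(np.clip(u @ v, -1.0, 1.0))
    return np.array([d_ab, d_bc, angle])

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))       # random orthogonal matrix
pos = rng.normal(size=(3, 3))                      # three atoms in 3D
moved = pos @ Q.T + np.array([1.0, -2.0, 0.5])     # rotate + translate

# internal coordinates are unchanged
assert np.allclose(internal_coords(pos), internal_coords(moved))
```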

This is also a main advantage of this approach compared to e.g. NequIP: We can design arbitrary models with no restrictions imposed by equivariant embeddings.

And yes, there are no learnable parameters in the above equations. All learnable parameters come in later.
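This can also be seen mechanically: a fixed basis contains no trainable weights, yet gradients still flow through it, so forces F = -dE/dpos can be obtained by automatic differentiation. A toy PyTorch sketch (the `rbf` helper and the energy readout are made up for illustration, not the repository's code):

```python
import torch

def rbf(d, num_radial=6, cutoff=5.0):
    # fixed sinusoidal radial basis; no trainable parameters involved
    n = torch.arange(1, num_radial + 1, dtype=d.dtype)
    d = d.unsqueeze(-1)
    return (2.0 / cutoff) ** 0.5 * torch.sin(n * torch.pi * d / cutoff) / d

pos = torch.randn(2, 3, requires_grad=True)        # two atoms in 3D
d = (pos[0] - pos[1]).norm().unsqueeze(0)          # interatomic distance
energy = rbf(d).sum()                              # stand-in for a learned energy readout
forces = -torch.autograd.grad(energy, pos)[0]      # forces via autograd
print(forces.shape)  # torch.Size([2, 3])
```

Since the energy depends only on the interatomic distance, the two resulting forces are equal and opposite, as expected physically.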

I hope my explanation helps!

Johannes