ardigen / MAT

The official implementation of the Molecule Attention Transformer.
MIT License
234 stars · 57 forks

How is invariance to order of atoms in the molecule achieved? #7

Closed tsjain closed 4 years ago

tsjain commented 4 years ago

Hi,

Thanks for the really nice and well explained paper.

I have a question about how the prediction output is invariant to the order of the atoms in the molecule. One can randomly permute the order of atoms in the adjacency matrix, the distance matrix, and the atom feature matrix.

Will the MAT give the same property prediction for the different permutations?

My understanding is that the learned attention is between positions, so by itself it is not permutation invariant. In NLP uses of the Transformer, a positional encoding term is added to help the model use positional context, but unlike in language tasks, the order of the atoms in a molecule can be specified quite arbitrarily.
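To make the concern concrete, here is a minimal NumPy sketch (not MAT code; the single-head attention and random weights are illustrative assumptions). Plain scaled dot-product self-attention is permutation *equivariant*: permuting the input rows just permutes the output rows. Adding a fixed positional encoding breaks that property:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 4, 6                           # number of atoms/tokens, feature dim
X = rng.normal(size=(n, d))           # input features
W = rng.normal(size=(d, d))           # illustrative value projection
pos = rng.normal(size=(n, d))         # fixed positional encodings

def softmax(M):
    e = np.exp(M - M.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attend(X):
    # simplified self-attention: softmax(X X^T / sqrt(d)) (X W)
    return softmax(X @ X.T / np.sqrt(d)) @ (X @ W)

perm = rng.permutation(n)

# without positional encoding: outputs are the same up to row order
print(np.allclose(attend(X)[perm], attend(X[perm])))              # True
# with positional encodings added: the outputs genuinely differ
print(np.allclose(attend(X + pos)[perm], attend(X[perm] + pos)))  # False
```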

Thanks.

Mazzza commented 4 years ago

Hello,

All of our layers and operations are permutation invariant, so changing the order of the atoms does not change the output.

We do not use positional encodings as in the NLP Transformer. Instead, information about the structure of the molecule is given to the model through the adjacency and distance matrices, which are permuted together with the atom features. Thanks to this, our Molecule Self-Attention is permutation invariant.
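This can be checked numerically with a simplified sketch (my own toy version, not the official MAT code: single head, random weights, and equal mixing weights `lt`, `la`, `ld` are assumptions). Attention is mixed with the adjacency and distance terms; because all three are permuted consistently, the per-atom outputs only get reordered, and an order-insensitive pooling such as the mean yields an identical prediction:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 8                                # atoms, feature dim

X = rng.normal(size=(n, d))                # atom feature matrix
A = rng.integers(0, 2, size=(n, n)).astype(float)
A = np.minimum(A + A.T, 1.0)               # symmetric adjacency
np.fill_diagonal(A, 1.0)                   # self-loops so rows are nonzero
D = rng.random((n, n)); D = (D + D.T) / 2  # symmetric distance matrix

Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

def softmax(M):
    e = np.exp(M - M.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def mol_self_attention(X, A, D, lt=1/3, la=1/3, ld=1/3):
    # toy molecule self-attention: blend dot-product attention with
    # row-normalized adjacency and a distance-based weighting
    att = softmax((X @ Wq) @ (X @ Wk).T / np.sqrt(d))
    dist = softmax(-D)                     # closer atoms weigh more (illustrative)
    mix = lt * att + la * A / A.sum(axis=1, keepdims=True) + ld * dist
    return mix @ (X @ Wv)

def predict(X, A, D):
    # mean pooling over atoms removes the remaining row ordering
    return mol_self_attention(X, A, D).mean(axis=0)

perm = rng.permutation(n)
out1 = predict(X, A, D)
out2 = predict(X[perm], A[np.ix_(perm, perm)], D[np.ix_(perm, perm)])
print(np.allclose(out1, out2))             # True
```

Note that the permutation must be applied to rows *and* columns of `A` and `D` (via `np.ix_`) but only to rows of `X`, matching how a reordering of atoms acts on each matrix.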