lrjconan / LanczosNetwork

Lanczos Network, Graph Neural Networks, Deep Graph Convolutional Networks, Deep Learning on Graph Structured Data, QM8 Quantum Chemistry Benchmark, ICLR 2019
http://arxiv.org/abs/1901.01484
MIT License

Question: benefit to using distance matrix? #2

Open alexvpickering opened 5 years ago

alexvpickering commented 5 years ago

Hi, I noticed that to_graph in dataset/get_qm8_data.py doesn't use the 7th channel (distance matrix). I presume this is related to this line in your paper:

Since some models cannot leverage feature on edges easily, we use the molecule graph itself as the only input information for all models so that it is a fair comparison.

I am wondering if you have tried including the 7th channel (or if it is even valid to do so)? So to_graph would return:

...
# return atom_feat, pair_feat[:, :, :6]
return atom_feat, pair_feat[:, :, :7]

If so, what would you suggest for generating distance matrices for other molecules (not from QM8 dataset)? Is AllChem.EmbedMolecule reasonable? Thank you for making your code available!

lrjconan commented 5 years ago

Hi, thanks for your interest in our work!

You are right that we do not use edge features, e.g., the distance matrix, for the reason you quoted.

I have not tried adding the 7-th channel. For graph convolutional models, you could add one more channel to the graph Laplacian, as I did in the code. To construct that channel, you can first use a Gaussian kernel to build the weighted adjacency matrix A[i,j] = exp(-d[i,j]^2 / sigma), where d is the distance, so that closer atoms get larger weights. One thing to note is that I use 0-based indexing here. Therefore, pair_feat[:, :, 7] should be the distance matrix.
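A minimal sketch of that construction, assuming the standard Gaussian form with a negative squared distance, exp(-d^2 / sigma), and a toy array standing in for pair_feat. The helper name gaussian_affinity, the value of sigma, the choice to zero the diagonal, and the 8-channel toy shape are illustrative assumptions, not taken from the repository:

```python
import numpy as np


def gaussian_affinity(dist, sigma=1.0):
    """Map a pairwise distance matrix to Gaussian-kernel edge weights.

    Hypothetical helper: computes exp(-d^2 / sigma) elementwise.
    """
    dist = np.asarray(dist, dtype=np.float64)
    affinity = np.exp(-dist ** 2 / sigma)
    # Assumption: no self-loops in the extra channel; drop this line to keep them.
    np.fill_diagonal(affinity, 0.0)
    return affinity


# Toy stand-in for pair_feat: 3 atoms, 8 channels, with distances stored in
# channel 7 (0-based), as described in the reply above.
coords = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
pair_feat = np.zeros((3, 3, 8))
pair_feat[:, :, 7] = dist

# Build the extra channel and append it to the six channels that to_graph
# currently returns.
extra_channel = gaussian_affinity(pair_feat[:, :, 7], sigma=2.0)
augmented = np.concatenate(
    [pair_feat[:, :, :6], extra_channel[:, :, None]], axis=-1
)
```

Here sigma sets how quickly the weight decays with distance, so it would likely need tuning against the typical interatomic distances in the dataset.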

I am not familiar with the RDKit function AllChem.EmbedMolecule. I would guess that any reasonable similarity measure between atoms should work here.

alexvpickering commented 5 years ago

Thank you for your response!

I'll let you know if I do try to adapt your algorithm to include the distance matrix. Cheers,