Closed pfebrer closed 4 years ago
@pfebrer96 Both SchNet and MEGNet have periodic BCs for looking at crystalline materials, so it should be possible to do the same here. (I highlight these two only because DimeNet benchmarks against them and both have code available on GitHub, not due to any association.) I am also interested in such applications, so I have been looking at the code to see how it might be done.
Wow, thank you very much! I didn't notice that. I'm going to try to understand how they do it then :)
Great to hear from you both!
Last summer we did some small experiments on materials using the periodic BCs you mentioned, but then decided to focus on small molecules so we're not spread out too thinly. From what I remember some of the settings like the cutoff (which is one of the most important hyperparameters in general) need to be set differently for periodic materials, but in general it seemed to work. I can't report anything specific and we haven't worked on that direction since, though.
Ok, thanks! I'm trying to understand how the input data is generated and structured to get a sense of how this should be done.
This may be obvious, but when you were experimenting with periodic materials, did you add extra atoms with their positions R, or did you modify how you calculate edges to account for the periodic conditions?
For example, in DataContainer, you calculate distances like this:
https://github.com/klicperajo/dimenet/blob/bf725c33755cd6fb87661fe03956b5fb30889742/dimenet/training/data_container.py#L69
and then apply the cutoff. This obviously only finds distances within the unit cell of the material, so it seems that you would need to add extra atoms to "fake" the periodicity. I still lack a deep understanding of how the model works: is it possible, and does it make sense, to also calculate edges based on the periodic images of the atoms/nodes? That is, in my drawing, you would have two distances between atoms 1 and 4: the distance inside the unit cell, and the distance to the nearest periodic image. Only the second one would "survive" the cutoff.
Thanks!
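To make the question concrete, here is a minimal NumPy sketch of the second option: building edges from periodic images instead of adding extra atoms. It is not DimeNet's code. The function name `periodic_edges`, its arguments, and the integer `shift` bookkeeping are all hypothetical, and it assumes the rows of `cell` are the lattice vectors and that every atom lies inside the unit cell.

```python
import itertools

import numpy as np


def periodic_edges(R, cell, cutoff):
    """Hypothetical helper: all directed edges i -> j within `cutoff`,
    where j may be a periodic image of an atom.

    Returns (idx_i, idx_j, shifts, dists); `shifts` holds the integer
    lattice translation applied to atom j for each edge.
    """
    R = np.asarray(R, dtype=float)
    cell = np.asarray(cell, dtype=float)  # rows are lattice vectors (assumption)

    # Spacing between lattice planes in each direction; this bounds how many
    # cell repeats can contain a neighbor within the cutoff.
    heights = 1.0 / np.linalg.norm(np.linalg.inv(cell), axis=0)
    n_rep = np.ceil(cutoff / heights).astype(int)

    idx_i, idx_j, shifts, dists = [], [], [], []
    for shift in itertools.product(*[range(-n, n + 1) for n in n_rep]):
        offset = np.array(shift) @ cell  # Cartesian translation of this image
        # diff[i, j] = vector from atom i to the translated copy of atom j
        diff = R[None, :, :] + offset - R[:, None, :]
        d = np.linalg.norm(diff, axis=-1)
        mask = d <= cutoff
        if not any(shift):
            # In the home cell, an atom is not its own neighbor.
            mask &= ~np.eye(len(R), dtype=bool)
        i, j = np.nonzero(mask)
        idx_i.extend(i)
        idx_j.extend(j)
        shifts.extend([shift] * len(i))
        dists.extend(d[i, j])
    return np.array(idx_i), np.array(idx_j), np.array(shifts), np.array(dists)


# Two atoms in a 4 x 4 square cell: 3.0 apart inside the cell,
# but only 1.0 apart across the left/right boundary.
R = [[0.5, 2.0], [3.5, 2.0]]
cell = [[4.0, 0.0], [0.0, 4.0]]
short = periodic_edges(R, cell, cutoff=1.5)
long = periodic_edges(R, cell, cutoff=3.5)
```

With the short cutoff, only the two directed edges that cross the boundary remain. With the longer cutoff, the same pair of atoms is connected twice, once inside the cell and once through the boundary, so the same index pair appears more than once in the edge lists.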
I don't think you want a cutoff that is so short it will remove the neighbor inside the unit cell. We've used a cutoff of 5 Å for small molecules, so this can even include third-hop neighbors in the molecular graph. We didn't spend too much time investigating periodicity, so I can't give you any exact hints. The fact that these two atoms would be connected in two different ways might be problematic, but you'd have to test it yourself. I think you can have duplicate indices in the index lists we use, so that should work. You just need to make sure that you calculate distances and angles consistently.
I don't think that adding fake atoms is a good idea, since every atom needs to consistently update its embeddings, which seems problematic with fake atoms.
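One way to read "calculate distances and angles consistently" is to derive both from per-edge vectors that include the lattice translation, rather than from atom indices alone, since two edges between the same atoms can then differ. The sketch below is only an illustration under that assumption, not the repository's implementation. The names `edge_vectors` and `triplet_angles`, and the `src`/`dst`/`shifts` layout, are hypothetical.

```python
import numpy as np


def edge_vectors(R, cell, src, dst, shifts):
    """Vector of each edge, from atom src[e] to the periodic image of dst[e]."""
    R = np.asarray(R, dtype=float)
    cell = np.asarray(cell, dtype=float)  # rows are lattice vectors (assumption)
    return R[dst] + np.asarray(shifts) @ cell - R[src]


def triplet_angles(src, dst, shifts, vec):
    """Angles for pairs of edges (e_in: k -> j, e_out: j -> i) meeting at atom j.

    The edge that runs straight back along e_in (same atoms, opposite
    shift) is skipped. Returns (e_in, e_out, angles).
    """
    shifts = np.asarray(shifts)
    e_in, e_out, angles = [], [], []
    for a in range(len(src)):
        for b in range(len(src)):
            if dst[a] != src[b]:
                continue
            if src[a] == dst[b] and np.array_equal(shifts[a], -shifts[b]):
                continue  # b is a reversed: no angle to define
            # Angle at atom j between the directions j -> k and j -> i.
            u, v = -vec[a], vec[b]
            cos = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
            e_in.append(a)
            e_out.append(b)
            angles.append(np.arccos(np.clip(cos, -1.0, 1.0)))
    return np.array(e_in), np.array(e_out), np.array(angles)


# Two atoms on a line in a 4 x 4 cell, each connected twice:
# once inside the cell and once across the boundary.
R = [[0.5, 2.0], [3.5, 2.0]]
cell = [[4.0, 0.0], [0.0, 4.0]]
src = np.array([0, 0, 1, 1])
dst = np.array([1, 1, 0, 0])
shifts = np.array([[0, 0], [-1, 0], [0, 0], [1, 0]])

vec = edge_vectors(R, cell, src, dst, shifts)
dist = np.linalg.norm(vec, axis=1)  # distances come from the same vectors
e_in, e_out, angles = triplet_angles(src, dst, shifts, vec)
```

Because the duplicate edges carry different shifts, they yield different vectors, so the distance and angle attached to each one stay unambiguous even though the atom indices repeat. No extra atoms are created, so each real atom still has a single embedding to update.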
Great, thanks for the comments! I will keep them in mind.
Should I close this?
You're welcome!
Hi, very nice work on this! :)
I've been exploring the ML/deep learning landscape to find some inspiration for cool ideas that would be nice to play with during my PhD in materials science. I've seen lots of implementations of deep learning for molecules, but not so much for periodic structures such as crystals.
I would like to know if you have given any thought to how periodic conditions could work in a GNN, and specifically in DimeNet. Maybe you have already implemented it and I have failed to find it (in that case, excuse me). I have some intuition about it, but I would like to know your thoughts, if it's not too much to ask.
From what I understood in your paper, the information about the atom/node positions is only "stored" at the bonds/edges, encoded as the angles and bond lengths. Is this right? If so, my intuition is that, given a periodic system like this one:
you can say that, in the left border, atom 1 is effectively connected to atom 4 through a connection that is in the direction of bond 8 in this drawing. Then, in my naive view, this should account fully for the periodicity of the system, because atom 4 contains the information of the rest of the structure and a kind of loop will be created there.
I'd like to know if you think this makes sense and, if not, I would appreciate it if you could share the reasons why it won't work.
Thanks in advance!