MinkaiXu / GeoDiff

Implementation of GeoDiff: a Geometric Diffusion Model for Molecular Conformation Generation (ICLR 2022).
MIT License

Why use AddHigherOrderEdges() in sampling but not in training? #10

Status: Closed (closed by Layne-Huang 1 year ago)

Layne-Huang commented 2 years ago

Dear Minkai,

I noticed that you apply the AddHigherOrderEdges() transformation when preparing the dataset for sampling, but not for training. Why do you use this transformation, and why is it not used in the training process?

Thank you very much!

MinkaiXu commented 2 years ago

Hi Lei,

Actually, the data processing code is just borrowed from ConfGF and there is no specific consideration. I think AddHigherOrderEdges() can also be nicely added to the training process and there should be no problem.

Layne-Huang commented 2 years ago

Thank you very much for your reply. However, I found that AddHigherOrderEdges() is quite important: the model cannot generate valid conformations if I do not use it in sampling. I therefore think it may be closely related to the sampling process. Could you please take a look at this transformation?

Thank you for your attention.

MinkaiXu commented 2 years ago

Yes, you are right, this part is important. I just mean that, in addition to sampling, you can also add it to training and there should be no problem, except that training will be slower. Higher-order edges can help to fix the full freedom of coordinates, e.g., rotatable bonds and so on, so they are very important for generating the structure.
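
For readers following along, here is a minimal sketch of what "adding higher-order edges" means on a molecular graph. It is not the repository's AddHigherOrderEdges() code: the function name hop_order_matrix, its arguments, and the NumPy-based approach are assumptions made for illustration. The idea is to label every atom pair by its hop distance (1 = bonded, 2 = shares a neighbor, 3 = ends of a torsion) up to a chosen maximum order.

```python
import numpy as np

def hop_order_matrix(num_nodes, bonds, max_order=3):
    """Hypothetical helper: entry (i, j) is the hop distance between atoms
    i and j if it lies in 1..max_order, and 0 otherwise."""
    adj = np.zeros((num_nodes, num_nodes), dtype=np.int64)
    for i, j in bonds:
        adj[i, j] = adj[j, i] = 1

    order = adj.copy()  # 1-hop pairs are the chemical bonds
    # Pairs reachable within the current number of hops (self included).
    reached = adj + np.eye(num_nodes, dtype=np.int64)
    for k in range(2, max_order + 1):
        # Extend the current reach by one more bond.
        new_reached = ((reached @ adj + reached) > 0).astype(np.int64)
        # Pairs reached for the first time at exactly k hops get label k.
        order += k * (new_reached - reached)
        reached = new_reached
    return order

# Chain of four heavy atoms (butane-like backbone): 0-1-2-3
order = hop_order_matrix(4, [(0, 1), (1, 2), (2, 3)])
print(order)
```

In this sketch, the non-zero entries greater than 1 are the extra (higher-order) edges a model would see in addition to the bonds.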

Layne-Huang commented 2 years ago

Thank you for your kind reply. It really clears up my confusion.

Hope you have a good day!

Layne-Huang commented 2 years ago

By the way, regarding "full freedom of coordinates": why can't the model generate valid positions if we do not apply AddHigherOrderEdges()? Does this mean it is necessary for generating coordinates? This is interesting, since the training process does not include it. Are there alternative methods if we do not apply AddHigherOrderEdges()?

MinkaiXu commented 2 years ago

I would say it is important for generating realistic coordinates. You may take a look at Sec 2.1 of this paper (https://arxiv.org/pdf/1909.11459.pdf), which describes why higher-order edges are necessary. As far as I know, all state-of-the-art models take advantage of higher-order edges.
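
One way to see why higher-order edges matter geometrically: bond lengths alone leave every bond angle free, whereas the distance between 2-hop neighbors pins the angle down through the law of cosines. The snippet below is only an illustrative sketch of that relation. The function names are hypothetical, and the values (1.54 Å bonds, 109.5° angle) are typical textbook numbers chosen for the example, not data from the paper or the repository.

```python
import math

def one_three_distance(d12, d23, angle_deg):
    """Distance between atoms 1 and 3, given the two bond lengths and the
    bond angle at atom 2 (law of cosines)."""
    theta = math.radians(angle_deg)
    return math.sqrt(d12**2 + d23**2 - 2.0 * d12 * d23 * math.cos(theta))

def bond_angle(d12, d23, d13):
    """Inverse relation: recover the bond angle in degrees from the two
    bond lengths and the 2-hop distance."""
    cos_theta = (d12**2 + d23**2 - d13**2) / (2.0 * d12 * d23)
    # Clamp to guard against tiny floating-point overshoot.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))

# Illustrative values only: two C-C bonds at a roughly tetrahedral angle.
d13 = one_three_distance(1.54, 1.54, 109.5)
angle = bond_angle(1.54, 1.54, d13)
print(d13, angle)
```

By the same reasoning, 3-hop distances constrain the torsion around a rotatable bond (up to its sign), which is consistent with the remark above that these edges are important for generating the structure.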