Is the adjacency matrix fully connected when it is first input to the GCPNet++?

BioinfoMachineLearning / bio-diffusion

A geometry-complete diffusion generative model (GCDM) for 3D molecule generation and optimization (Nature CommsChem)

Is the adjacency matrix fully connected when it is first input to the GCPNet++? #8

Closed 18hfliu closed 5 months ago

18hfliu commented 6 months ago

Following this link (https://github.com/BioinfoMachineLearning/bio-diffusion/blob/775a9760972f6ec254c44dd6e623c4451403ca4f/src/models/components/gcpnet.py#L1095-L1099), I inspected the batch.edge_index variable. It appears to represent the adjacency matrix of the molecule, but it is fully connected. This means the atomic bond information in the molecule is not used, which will keep the graph model's message passing mechanism from playing its full role.

amorehead commented 6 months ago

Hello. Fully-connected message passing is to be expected here. For the task of 3D molecule generation, one does not fully know what the molecular bond graph should look like until the final diffusion timestep (i.e., 0). Instead, once the final 3D molecule is generated, the bond graph is inferred from the 3D coordinates themselves (see, e.g., EDM and GeoLDM). This is what we refer to as "implicit bond prediction" for 3D molecule generation, as opposed to "explicit bond prediction" methods that directly generate both 3D coordinates and bond (edge) labels.
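
To make the idea of inferring bonds from coordinates concrete, here is a minimal sketch. It is not the code used by this repository, EDM, or GeoLDM. The function name `infer_bonds`, the `tolerance` parameter, and the small table of typical single-bond lengths are all illustrative assumptions; real pipelines use larger lookup tables and also distinguish bond orders.

```python
import numpy as np

# Assumption: a tiny illustrative table of typical single-bond lengths (angstroms).
# Keys are alphabetically sorted element pairs. Not taken from the repository.
TYPICAL_BOND_LENGTH = {
    ("C", "C"): 1.54,
    ("C", "H"): 1.09,
    ("C", "O"): 1.43,
    ("H", "O"): 0.96,
}


def infer_bonds(symbols, coords, tolerance=0.1):
    """Return index pairs (i, j), i < j, whose distance is within a bond-length cutoff.

    `tolerance` is a hypothetical slack (in angstroms) added to the reference length.
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between all atoms, shape (N, N).
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    bonds = []
    for i in range(len(symbols)):
        for j in range(i + 1, len(symbols)):
            key = tuple(sorted((symbols[i], symbols[j])))
            ref = TYPICAL_BOND_LENGTH.get(key)
            # Pairs missing from the table (e.g. H-H here) are never bonded.
            if ref is not None and dists[i, j] <= ref + tolerance:
                bonds.append((i, j))
    return bonds


# Usage: an approximate water geometry (O at the origin, two H atoms).
water_bonds = infer_bonds(
    ["O", "H", "H"],
    [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]],
)
print(water_bonds)
```

Because the bond graph only appears in this post-processing step, the network never needs a bond-based adjacency matrix while it is denoising.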

amorehead commented 6 months ago

Moreover, in our ablation experiments, we find that employing a fully-connected adjacency matrix is crucial for achieving a high percentage of valid and stable 3D molecules. Intuitively, this is because fully-connected message passing approximates something akin to self-attention over the input graphs, which I believe highlights the importance of global attention for this problem. In other words, when we want to generate new molecules, it is safe to assume that any pair of atoms could be bonded until we finalize the 3D molecule's geometry (at which point we can infer the final bond graph).
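
For readers wondering what a fully-connected edge_index looks like in practice, the following sketch builds one in the common (2, num_edges) layout. The helper names `fully_connected_edge_index` and `batched_edge_index` are hypothetical, and excluding self-loops is an assumption made here; the repository's own construction may differ.

```python
import numpy as np


def fully_connected_edge_index(num_nodes):
    """All ordered pairs (src, dst) with src != dst, as an array of shape (2, N * (N - 1))."""
    src, dst = np.meshgrid(np.arange(num_nodes), np.arange(num_nodes), indexing="ij")
    mask = src != dst  # Assumption: drop self-loops.
    return np.stack([src[mask], dst[mask]], axis=0)


def batched_edge_index(sizes):
    """Fully connect each molecule separately, offsetting node indices per molecule."""
    blocks, offset = [], 0
    for n in sizes:
        blocks.append(fully_connected_edge_index(n) + offset)
        offset += n
    return np.concatenate(blocks, axis=1)


# Usage: a batch holding a 2-atom and a 3-atom molecule.
edge_index = batched_edge_index([2, 3])
print(edge_index.shape)
```

Every atom exchanges messages with every other atom of the same molecule, which is why the pattern resembles self-attention, and no edges cross molecule boundaries.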

amorehead commented 5 months ago

If you have any follow-up questions regarding this point, please let me know. Otherwise, I will be closing this issue.

18hfliu commented 5 months ago

Sorry, I was on vacation recently and didn't have the chance to follow up. I think I understand what you mean. Are you describing a graph network that could use a Transformer?

amorehead commented 5 months ago

In a sense, yes, but not necessarily. In our experiments, we find that a lightweight version of (sigmoid-gated) attention (on the fully-connected graph edges) is crucial for achieving a high percentage of stable 3D molecules across contexts and datasets. You can imagine Transformers working well for this too, although they might incur somewhat more computational cost by increasing the number of parameters in one's network.
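
As a rough illustration of sigmoid-gated attention over fully-connected edges, here is a minimal NumPy sketch. It is not the GCPNet++ layer: the function `gated_aggregate`, the weight shapes, and the use of plain linear maps for both the message and the gate are assumptions chosen to keep the example short.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gated_aggregate(h, edge_index, w_msg, w_gate):
    """Sum gated messages into each destination node.

    h:          (N, F) node features
    edge_index: (2, E) rows are source and destination node indices
    w_msg:      (2F, F) hypothetical message weights
    w_gate:     (2F, 1) hypothetical gate weights (one scalar gate per edge)
    """
    src, dst = edge_index
    pair = np.concatenate([h[src], h[dst]], axis=-1)  # (E, 2F)
    messages = pair @ w_msg                            # (E, F)
    gates = sigmoid(pair @ w_gate)                     # (E, 1), each in (0, 1)
    out = np.zeros_like(h, dtype=float)
    # Unbuffered scatter-add so repeated destination indices accumulate.
    np.add.at(out, dst, gates * messages)
    return out


# Usage: three nodes, fully connected without self-loops.
h = np.ones((3, 2))
edge_index = np.array([[0, 0, 1, 1, 2, 2], [1, 2, 0, 2, 0, 1]])
out = gated_aggregate(h, edge_index, np.ones((4, 2)), np.zeros((4, 1)))
print(out)
```

Unlike softmax attention, each edge's gate is computed independently and the gates are not normalized across neighbors. The gate itself adds only a single (2F, 1) weight matrix in this sketch, whereas a Transformer layer adds separate query, key, and value projections.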