Closed jschrier closed 2 years ago
Thanks for the feedback! The tensor description is the more general and important for understanding molecular graphs. The simplification later on is because a GCN cannot accommodate an adjacency tensor. I did not know about that rdkit function, thanks for pointing it out!
I'll add a bit more reasoning about why/when we want to know what kind of bonds there vs. just where there are covalent bonds.
Fixed by a1a7626
In the beginning of the Graph Neural Network section ,
smiles2graph
(Section 8.2) defines anadj
output as a rank 3 tensor, where the last index describes the bond order types.This last axis gets summed over in Section 8.4.1 to form the
adj_mat
that gets used in subsequent graph NN calculations.Then in Section 8.5, the
gen_smiles2graph
function defines theadj
output as a rank 2 tenor from the start.My question here is: What's the point of defining the first case to be a rank 3 tensor if that result never gets used? Is it just so that students have to practice an np.sum?
Suggestion: Why not make this the same? This would minimize any differences between the two versions to just be the change in the node descriptions.
Suggestion 2: In fact, why not just replace the adjacency matrix determination with
adj = Chem.GetAdjacencyMatrix(m)
(+ identity) in both cases, which would shave off a few lines of code by using the built-in functionality?