akensert / molgraph

Graph neural networks for molecular machine learning. Implemented and compatible with TensorFlow and Keras.
https://molgraph.readthedocs.io/en/latest/
MIT License

Do you have a directed graph implementation like DiGIN or ChemProp? #3

Closed thegodone closed 1 year ago

thegodone commented 1 year ago

I am looking for the Directed GIN / Directed MPNN version:

https://www.mdpi.com/1420-3049/26/20/6185

thanks

GG

akensert commented 1 year ago

Hi @thegodone. Thanks for the feedback. I'm looking into this now and will try to implement a Directed MPNN by modifying the MPNN model that already exists (in MolGraph). It is more difficult (less trivial) than I first thought, as for each edge a->b I need to check whether a<-b exists, and if so, subtract that from the sum (cf. equation 2 in your paper). Note that I try to work with what I have and prefer not to change the GraphTensor too much.

And I should also mention that the layers and models that currently exist (in MolGraph) work well with directed graphs too; I don't see any issues, at least. That said, the directed MPNN and directed GIN are somewhat different from MPNN and GIN, as they use messages associated with directed edges instead of vertices (to prevent 'totters'). See e.g. this paper. So I would like to implement them too (at least something similar to the directed MPNN variant in the paper I referenced).

[EDIT] Given the following graph (or GraphTensor):

GraphTensor(..., edge_dst=[0, 0, 1, ...], edge_src=[1, 2, 0, ...])

My approach is to compute a message (cf. equation 2 in your paper): (1) `m_{0->1} = h_{1->0} + h_{2->0}` (tf.segment_sum based on edge_dst followed by tf.gather based on edge_src) and then (2) `m_{0->1} -= h_{1->0}` (this is the difficult step, where I need to find the appropriate `h_{k->v}` to subtract; in this case `h_{1->0}`). On paper it seems like a strange approach, but implementation-wise it makes sense to me, as I don't know how I can construct edge_dst and edge_src to exclude `h_{1->0}` to begin with.
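
Something along these lines, in plain TensorFlow (just a sketch of the two steps above, not the final MolGraph implementation; `directed_message_step` is a hypothetical name, and the O(E^2) reverse-edge lookup assumes every edge has a reverse, which is what MolecularGraphEncoder produces):

import tensorflow as tf

def directed_message_step(edge_dst, edge_src, edge_state):
    # For each edge v->w: m_{v->w} = sum_{k in N(v)} h_{k->v} - h_{w->v}.
    num_nodes = tf.reduce_max(tf.concat([edge_dst, edge_src], axis=0)) + 1
    # (1) Sum incoming edge states per node (segment sum over edge_dst), then
    #     gather that sum for each edge's source node: sum_k h_{k->v}.
    incoming_sum = tf.math.unsorted_segment_sum(edge_state, edge_dst, num_nodes)
    message = tf.gather(incoming_sum, edge_src)
    # (2) Locate the reverse edge w->v of each edge v->w and subtract its state.
    #     O(E^2) pair matching; fine for small molecular graphs, and it assumes
    #     every edge has a reverse edge.
    pair = tf.stack([edge_src, edge_dst], axis=1)
    reverse_pair = tf.stack([edge_dst, edge_src], axis=1)
    match = tf.reduce_all(tf.equal(reverse_pair[:, None], pair[None, :]), axis=-1)
    reverse_index = tf.argmax(tf.cast(match, tf.int32), axis=1)
    return message - tf.gather(edge_state, reverse_index)

# The graph above, completed so that every edge has a reverse:
edge_dst = tf.constant([0, 0, 1, 2])
edge_src = tf.constant([1, 2, 0, 0])
edge_state = tf.random.uniform((4, 16))
messages = directed_message_step(edge_dst, edge_src, edge_state)
# messages[2] (edge 0->1) equals edge_state[1] (h_{2->0}) only,
# since h_{1->0} has been subtracted.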

Feel free to give further feedback on this if needed. :)

thegodone commented 1 year ago

Hi Alexander, this is the correct implementation: summing over the source edges followed by removing the destination (reverse) edge contribution.

Is it possible to have a layer too, like MPNNConv? And a complete model example, e.g. on ESOL, please? I want to benchmark the models' performance against KGCNN.

akensert commented 1 year ago

Hi @thegodone. Thanks. (The directed MPNN/GIN implementations should correspond to what I wrote above.)

Regarding implementation of a layer (and example): I will look into it.

thegodone commented 1 year ago

Did you also use "directed edge" data, i.e. change the data generation process from undirected to directed only?

akensert commented 1 year ago

Not sure what you mean exactly (sorry). There is currently no module that generates directed graphs. MolecularGraphEncoder generates (small) molecular graphs, which are undirected; or you can actually think of them as directed graphs wherein edge u -> w equals edge w -> u.

And the existing GNN layers can handle directed graphs. For instance, you could read one of the well-known citation datasets (note: there is no existing module in MolGraph to read those datasets) and then fit a GNN model to it.
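
To illustrate what I mean (a plain-TensorFlow sketch of the underlying aggregation, not the actual code of any MolGraph layer):

import tensorflow as tf

# Node-based aggregation as used by (undirected) GNN layers, applied to a
# purely directed graph: information flows only 1->0 and 2->0.
edge_dst = tf.constant([0, 0])            # destination node of each edge
edge_src = tf.constant([1, 2])            # source node of each edge
node_feature = tf.random.uniform((3, 8))  # 3 nodes, 8 features each

# Sum source-node features into each destination node. Nothing here requires
# reverse edges, which is why directed graphs work with the existing layers.
aggregated = tf.math.unsorted_segment_sum(
    tf.gather(node_feature, edge_src), edge_dst, num_segments=3)
print(aggregated.shape)  # (3, 8); node 0 receives the features of nodes 1 and 2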

The GraphTensor produced by MolecularGraphEncoder looks something like this (let's assume you are encoding the molecule CC, and the bond type is a one-hot encoding [single, double, triple]):

print(graph_tensor.edge_dst)
print(graph_tensor.edge_src)
print(graph_tensor.edge_feature)
>>> tf.Tensor([0 1], shape=(2,), dtype=int32)
>>> tf.Tensor([1 0], shape=(2,), dtype=int32)
>>> tf.Tensor(
>>>    [[1. 0. 0.]
>>>     [1. 0. 0.]], shape=(2, 3), dtype=float32)

So even though there is only one bond, two edges exist (one in each direction); in the case of molecules, the features of the two edges should be the same.
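
(And if you would ever want purely directed data, i.e. only one edge per bond, something like this works on the raw index tensors; just a sketch, there is no such utility in MolGraph:)

import tensorflow as tf

# Keep only one direction per bond by masking edges where edge_src < edge_dst,
# using the CC tensors above.
edge_dst = tf.constant([0, 1])
edge_src = tf.constant([1, 0])
edge_feature = tf.constant([[1., 0., 0.],
                            [1., 0., 0.]])

keep = edge_src < edge_dst                                   # [False, True]
edge_dst_directed = tf.boolean_mask(edge_dst, keep)          # [1]
edge_src_directed = tf.boolean_mask(edge_src, keep)          # [0]
edge_feature_directed = tf.boolean_mask(edge_feature, keep)  # shape (1, 3)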

Btw, I'm happy to start thinking about directed graph problems. So far I've been focusing mostly (almost exclusively) on undirected graphs.

(Apologies if I'm completely off track here and not answering your question at all.)