keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

[Question] Graph Machines in keras #4395

Closed jb-delafosse closed 7 years ago

jb-delafosse commented 7 years ago

Hello, I'm not sure if this is the proper place to ask a question about Keras. I apologize in advance if it is not.

I'm currently trying to reproduce this publication, which uses recurrent neural nets to predict chemical activities.

Goulon, A., T. Picot, A. Duprat, and G. Dreyfus. “Predicting Activities without Computing Descriptors: Graph Machines for QSAR.” SAR and QSAR in Environmental Research 18, no. 1–2 (January 1, 2007): 141–53. doi:10.1080/10629360601054313.

In short: the authors represent a molecule as a directed acyclic graph, in which each node is an atom and each edge is a chemical bond. A neural network is then applied to the "exit node" (called the root node in the publication). This network calls itself on the parents of the root node, and so on, to produce a prediction for the whole molecule.
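As a rough illustration of the recursion described above, here is a minimal pure-Python sketch. It is not Keras code and does not reproduce the paper's node function: the toy molecule, the single-number node labels, the `tanh` node function, and the names `parents`, `label`, `theta` and `node_output` are all assumptions made for the example.

```python
import math

# Hypothetical toy molecule: node -> list of its parent nodes
# (edges point toward the root). Skeleton C1 -> C2 -> O, with O as the root.
parents = {"O": ["C2"], "C2": ["C1"], "C1": []}

# Hypothetical one-number node label (here: atomic number / 10).
label = {"O": 0.8, "C2": 0.6, "C1": 0.6}


def node_output(node, parents, label, theta, cache=None):
    """Apply the same parameterised function at every node, recursing on parents."""
    if cache is None:
        cache = {}
    if node in cache:
        # In a DAG a node may be reached along several paths; compute it once.
        return cache[node]
    w_label, w_parent, bias = theta
    # Summing the parent outputs lets one shared weight handle any number of parents.
    incoming = sum(
        node_output(p, parents, label, theta, cache) for p in parents[node]
    )
    out = math.tanh(w_label * label[node] + w_parent * incoming + bias)
    cache[node] = out
    return out


theta = (0.5, 1.0, 0.0)  # shared parameters, identical at every node
prediction = node_output("O", parents, label, theta)
print(prediction)
```

The point of the sketch is that the "network" is a single function with one parameter set `theta`, and the molecule only determines how that function is composed with itself.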

I think I understand how the training works for this graph machine: for each molecule, the gradient of the cost function is computed by backpropagation through the molecule. The gradient over a batch of molecules is a weighted average of the gradients computed for each molecule in the batch. A gradient descent algorithm is then used to minimize the cost function.
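The training step described above can be sketched in plain Python as follows. This is only an illustration under several assumptions: molecules are simplified to chains of node labels, the per-molecule gradient is obtained by central finite differences standing in for backpropagation, and the batch data, weights, learning rate, and all function names are hypothetical.

```python
import math


def predict(chain, theta):
    """Toy recursive model on a chain molecule: labels listed from leaf to root."""
    w_label, w_parent = theta
    out = 0.0
    for x in chain:  # the same parameters are reused at every node
        out = math.tanh(w_label * x + w_parent * out)
    return out


def loss(chain, target, theta):
    """Squared error for a single molecule."""
    return (predict(chain, theta) - target) ** 2


def grad(chain, target, theta, eps=1e-6):
    """Central finite differences, standing in for backpropagation."""
    g = []
    for i in range(len(theta)):
        up = list(theta)
        up[i] += eps
        down = list(theta)
        down[i] -= eps
        g.append((loss(chain, target, up) - loss(chain, target, down)) / (2 * eps))
    return g


def batch_grad(batch, theta, weights):
    """Weighted average of the per-molecule gradients."""
    total = sum(weights)
    per_mol = [grad(chain, y, theta) for chain, y in batch]
    return [
        sum(w * g[i] for w, g in zip(weights, per_mol)) / total
        for i in range(len(theta))
    ]


def mean_loss(batch, theta):
    return sum(loss(chain, y, theta) for chain, y in batch) / len(batch)


# Hypothetical batch of (node labels, target activity); molecules differ in size.
batch = [([0.6, 0.6, 0.8], 0.5), ([0.6, 0.8], 0.2), ([0.6, 0.7, 0.6, 0.8], 0.9)]
weights = [1.0, 1.0, 1.0]
theta = [0.5, 1.0]

before = mean_loss(batch, theta)
g = batch_grad(batch, theta, weights)
theta = [t - 0.1 * gi for t, gi in zip(theta, g)]  # one gradient-descent step
after = mean_loss(batch, theta)
print(before, after)
```

Because the parameters are shared across molecules, each molecule can have its own computation graph while still contributing a gradient of the same shape, which is what makes the averaging possible.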

My main problem is: how can I implement this training procedure in Keras? Usually, you just provide a list of X and a list of corresponding Y, then call model.compile() and model.fit(). From what I understand, that is not possible here because each molecule should have its own keras.model.

Is Keras even the best solution here?

Best regards,

jb-delafosse commented 7 years ago

After some research: the standard name for this architecture is "recursive neural network" (not recurrent). It is described in chapter 10, page 400, of:

Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016. Chapter 10 http://www.deeplearningbook.org/contents/rnn.html.

[Figure: Architecture of the recursive neural network]

Consequently, this question is related to #352. Note that in my case, the U and W shown in the previous figure share the same weights, and each node is not forced to be bipartite.

coluccigiovanni16 commented 4 years ago

Hello @jbDelafosse, I know it's late, but I have just implemented the same structure as shown, in PyTorch. If you want to have a look, just see my repo.

jb-delafosse commented 4 years ago

My response is a bit late too, but a very good implementation of this can be found at https://www.dgl.ai/