akensert / molgraph

Graph neural networks for molecular machine learning, implemented with and compatible with TensorFlow and Keras.
https://molgraph.readthedocs.io/en/latest/
MIT License

Outputting saliency maps #22

Closed: vcanogil closed this issue 1 year ago

vcanogil commented 1 year ago

I'm not certain if I did something wrong when training the model, but once it is saved and loaded, one needs to invoke the compute_saliency function for the gradient activation mapping to be computed. Otherwise it raises a ValueError.

gam_model = GradientActivationMapping(
    model,
    [i.name for i in model.layers if "conv" in i.name],
    output_activation=None,
    discard_negative_values=True
)

gam = gam_model.compute_saliency(X, None)

This is different from how it is done in the docs, i.e.

gam_model = GradientActivationMapping(
    sequential_model,
    ['conv_1', 'conv_2', 'conv_3', 'conv_4'],
    output_activation=None,
    discard_negative_values=False,
)

gam = gam_model(x_data, verbose=1)
print(gam[0])
akensert commented 1 year ago

Hi @vcanogil , thanks for the feedback! I will look into this.

Could you supply me with just enough code to raise the error when not invoking model.compute_saliency? The intention is that one should not need to invoke model.compute_saliency, and that loading a model and immediately calling it (via __call__) should work.

vcanogil commented 1 year ago

Sure, here is what I've used:

import tensorflow as tf 
import numpy as np 
from molgraph import features, Featurizer, MolecularGraphEncoder
from molgraph.models import GradientActivationMapping

atom_encoder = Featurizer([
    features.Symbol(),
    features.TotalNumHs(),
    features.ChiralCenter(),
    features.Aromatic(),
    features.Ring(),
    features.Hetero(),
    features.HydrogenDonor(),
    features.HydrogenAcceptor(),
    features.CIPCode(),
    features.RingSize(),
    features.GasteigerCharge()
])

bond_encoder = Featurizer([
    features.BondType(),
    features.Conjugated(),
    features.Rotatable(),
    features.Ring(),
    features.Stereo(),
])

encoder = MolecularGraphEncoder(atom_encoder, bond_encoder)

# some model I trained and saved via model.save (i.e. Keras)
model = tf.keras.models.load_model(...)
graph = encoder(np.array(["CCC", "CCO"])) # a random example

gam_model = GradientActivationMapping(
    model,
    [i.name for i in model.layers if "conv" in i.name], # this is also arbitrary, it could be whichever layer
    output_activation=None,
    discard_negative_values=True
)

gam = gam_model(graph, None) # doesn't matter if you pass the second argument or not. 

At which point it should give you a ValueError that starts with

Could not find matching concrete function to call loaded from the SavedModel ...

After which it basically says that you're passing the wrong input.
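
For context, this class of error is generic to TensorFlow's SavedModel format: a restored function only keeps the concrete signatures it was traced with, so calling it with a differently-specced input fails. A minimal, molgraph-free sketch (the toy model and path are hypothetical):

import tensorflow as tf

# A toy Keras model, saved and then restored as a raw SavedModel.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(1),
])
model.save('/tmp/toy_model')
loaded = tf.saved_model.load('/tmp/toy_model')

# The restored __call__ only carries concrete functions for the input
# specs it was traced with (here, shape (None, 4)); any other spec
# raises a ValueError starting with "Could not find matching concrete
# function to call loaded from the SavedModel ...".
loaded(tf.zeros((2, 5)))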

akensert commented 1 year ago

Okay thanks for the info, I'll look into it!

It should be possible to save the gam_model, load it, and use it; though it seems that loading an encoder model, passing it to GradientActivationMapping, and then using it throws the error.

Previously, GradientActivationMapping was implemented as a keras.Model, but I stumbled upon some issues and decided to make it a tf.Module instead. It also does not make much sense to use a keras.Model, as one simply wants to map a graph to some output (i.e., "fit" and "evaluate" are not used).
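
For illustration, a hypothetical, much-simplified version of that tf.Module pattern (SaliencySketch is a made-up name, and plain input gradients stand in for molgraph's actual gradient activation mapping):

import tensorflow as tf

class SaliencySketch(tf.Module):
    """Wraps a trained model and maps inputs to a saliency-like output.

    No fit/evaluate machinery is involved; the module only exposes
    __call__.
    """

    def __init__(self, model):
        super().__init__()
        self.model = model

    def __call__(self, x):
        with tf.GradientTape() as tape:
            tape.watch(x)
            y = self.model(x)
        # Gradients of the output w.r.t. the input serve as a simple
        # saliency proxy here.
        return tape.gradient(y, x)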

akensert commented 1 year ago

@vcanogil which version are you using? The code you supplied works for me (0.5.7):

import tensorflow as tf 
import numpy as np 
from molgraph.chemistry import features, Featurizer, MolecularGraphEncoder
from molgraph import layers
from molgraph import models

atom_encoder = Featurizer([
    features.Symbol(),
    features.TotalNumHs(),
    features.ChiralCenter(),
    features.Aromatic(),
    features.Ring(),
    features.Hetero(),
    features.HydrogenDonor(),
    features.HydrogenAcceptor(),
    features.CIPCode(),
    features.RingSize(),
    features.GasteigerCharge()
])

bond_encoder = Featurizer([
    features.BondType(),
    features.Conjugated(),
    features.Rotatable(),
    features.Ring(),
    features.Stereo(),
])

encoder = MolecularGraphEncoder(atom_encoder, bond_encoder)

graph = encoder(np.array(["CCC", "CCO"]))

model = tf.keras.Sequential([
    tf.keras.layers.Input(type_spec=graph.unspecific_spec),
    layers.GCNConv(128, name='conv_1'),
    layers.GCNConv(128, name='conv_2'),
    layers.Readout(),
    tf.keras.layers.Dense(1)
])

model.compile('adam', 'mse')
model.fit(graph, tf.constant([1., 2.]))

model.save('/tmp/my_model')

model = tf.keras.models.load_model('/tmp/my_model')

gam_model = models.GradientActivationMapping(
    model,
    ['conv_1', 'conv_2'],
    output_activation=None,
    discard_negative_values=True
)

gam = gam_model(graph, None) 
akensert commented 1 year ago

Btw, there is possibly a bug in from_config() of the base GNN layer. I will fix this now (in 0.5.8). It could be the issue for you, but I'm not sure. Furthermore, to avoid future bugs/breaks in user code, I will soon migrate the GraphTensor to the (relatively new) tf.experimental.ExtensionType API (in 0.6.0). This migration will possibly break some user code, but I think it is worth it. It is better to use TF's public APIs than internal TF modules that aren't really supposed to be used by us.
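
For reference, a minimal illustration of the tf.experimental.ExtensionType API mentioned above (a toy type, not the actual GraphTensor):

import tensorflow as tf

class ToyGraph(tf.experimental.ExtensionType):
    # Fields are declared as type annotations; TF generates the
    # constructor and makes the type traceable through tf.function.
    node_feature: tf.Tensor
    edge_src: tf.Tensor
    edge_dst: tf.Tensor

graph = ToyGraph(
    node_feature=tf.zeros((3, 8)),
    edge_src=tf.constant([0, 1]),
    edge_dst=tf.constant([1, 2]),
)
print(graph.node_feature.shape)  # (3, 8)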

vcanogil commented 1 year ago

Sorry for getting back to you late.

It looks like this issue only occurred for me in version 0.5.7. Everything works as intended in 0.5.8. Still not sure why, but it's OK now.

Thanks for looking into it!

vcanogil commented 1 year ago

I have found that the bug is reproducible if you try to load a model that was saved using an older version of keras or tensorflow.

I was able to reproduce the bug by loading a model saved with keras 2.9.0 and tensorflow 2.10.0; with versions 2.12.0 for both, it works as intended.

So it seems like something in the TF API causes the bug.
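
For anyone debugging a similar issue, the installed versions can be checked directly (the comments reference the version combinations reported above):

import tensorflow as tf
import keras

print(tf.__version__)     # 2.10.0 reproduced the bug; 2.12.0 did not
print(keras.__version__)  # 2.9.0 reproduced the bug; 2.12.0 did not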