sicara / tf-explain

Interpretability Methods for tf.keras models with Tensorflow 2.x
https://tf-explain.readthedocs.io
MIT License

[Grad-CAM] I'm continuously getting this error #116

Closed · AntonioMarsella closed this issue 4 years ago

AntonioMarsella commented 4 years ago

The following is the model:

import tensorflow as tf
from tensorflow.keras.layers import Flatten

base_model = tf.keras.applications.vgg16.VGG16(weights='imagenet', include_top=False, input_shape=(48, 52, 3), classes=11)
inputs = tf.keras.Input(shape=(48, 52, 3))
x = base_model.get_layer('block1_conv1')(inputs)
x = base_model.get_layer('block1_conv2')(x)
x = base_model.get_layer('block1_pool')(x)
x = base_model.get_layer('block2_conv1')(x)
y = Flatten()(x)
outputs = tf.keras.layers.Dense(11, activation=tf.nn.softmax)(y)
model = tf.keras.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001, decay=0.0002),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])

I tried two things. First, the callback:

from tf_explain.callbacks.grad_cam import GradCAMCallback

callbacks = [
    GradCAMCallback(
        validation_data=(x_val, y_val),
        class_index=1,
        output_dir='../grad_cam',
    )
]

model.fit(x1, y1, batch_size=32, epochs=1, verbose=True, callbacks=callbacks)

and also building the sub-model manually with model1 = tf.keras.models.Model([model.inputs], [model.get_layer('block2_conv1').output, model.output]), but I always get:

ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_1:0", shape=(None, 48, 52, 3), dtype=float32) at layer "input_1". The following previous layers were accessed without issue: ['input_2']
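
For what it's worth, here is my reading of where the disconnection comes from (a sketch only, the actual tf-explain internals may differ): the VGG layers pulled out with get_layer() keep their original connection to base_model's own input ("input_1"), so when a sub-model from the model input to block2_conv1 plus the final output is rebuilt, the recorded output tensor of block2_conv1 can still point into the original VGG graph instead of my new "input_2" graph. The snippet below just reproduces the failure outside the callback:

# Hedged reproduction: build roughly the sub-model a Grad-CAM implementation
# needs (model input -> target conv output + model output). It fails here
# because block2_conv1's recorded output depends on base_model's "input_1",
# which is not an input of `model`.
try:
    grad_model = tf.keras.models.Model(
        [model.inputs],
        [model.get_layer('block2_conv1').output, model.output],
    )
except Exception as err:
    print(err)  # Graph disconnected: cannot obtain value for tensor ... "input_1" ...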

RaphaelMeudec commented 4 years ago

@AntonioMarsella I guess this is in no way related to tf-explain. You can extract those 4 layers from the VGG this way to make it work:

import numpy as np
import tensorflow as tf

base_model = tf.keras.applications.vgg16.VGG16(weights='imagenet', include_top=False, input_shape = (48,52,3),classes = 11)
base_model.summary()
x = base_model.get_layer("block2_conv1").output
y = tf.keras.layers.Flatten()(x)
outputs = tf.keras.layers.Dense(11, activation=tf.nn.softmax)(y)

model = tf.keras.Model(inputs=base_model.inputs, outputs=outputs)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001, decay=0.0002),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()

x = np.random.random((10, 48, 52, 3))
y = np.random.random((10, 11))

model.fit(x, y, batch_size = 32, epochs = 1, verbose = True)
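
If it helps, the callback from the original question should then attach to this model without the graph error, since the target conv layer and the classifier head now live in the same graph. A quick sketch (class_index, output_dir and the random data are just example values; layer_name is optional in recent tf-explain versions, if I remember correctly):

from tf_explain.callbacks.grad_cam import GradCAMCallback

callbacks = [
    GradCAMCallback(
        validation_data=(x, y),     # reusing the random data above as example validation data
        class_index=1,
        layer_name="block2_conv1",  # the conv layer to visualise
        output_dir="../grad_cam",
    )
]

model.fit(x, y, batch_size=32, epochs=1, verbose=True, callbacks=callbacks)
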
mferri17 commented 4 years ago

@RaphaelMeudec I found the same problem while trying to use GradCAM on a pre-trained (and later modified) model. My Graph disconnected error is very similar to the one above.

These StackOverflow answers try to explain the cause, which seems to be attributable to tf-explain itself. Could you take a look?


In case you need it as an example, my model summary is here (note that the model_1 layer is a big ResNet embedded in the loaded model), and this is my code:

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Model
from tf_explain.callbacks.grad_cam import GradCAMCallback

old_model = tf.keras.models.load_model(model_path)
old_model.trainable = False

y_1, y_2, y_3, y_4 = [old_model.layers[i].output for i in np.arange(-4, 0)] # old regression variables

last_dense = old_model.get_layer('2_dense').output # used to attach new classification variables
y_5 = (Dense(3, activation='softmax', name='x_class'))(last_dense)
y_6 = (Dense(3, activation='softmax', name='y_class'))(last_dense)
y_7 = (Dense(3, activation='softmax', name='z_class'))(last_dense)
y_8 = (Dense(3, activation='softmax', name='yaw_class'))(last_dense)

new_model = Model(inputs=old_model.inputs, outputs=[y_1, y_2, y_3, y_4, y_5, y_6, y_7, y_8])

new_model.compile(loss=list(np.array([['mean_absolute_error'] * 4, ['categorical_crossentropy'] * 4]).flatten()),
              metrics=['mse', 'accuracy'],
              optimizer='adam')

grad_cam = GradCAMCallback(validation_data=(x_valid_feed, y_valid_feed), class_index=0, output_dir=gradcam_folder)

history = new_model.fit(
    x = x_train_feed, 
    y = y_train_feed,
    validation_data = (x_valid_feed, y_valid_feed),
    callbacks = [grad_cam])

I am basically taking the old model and adding 4 new classification outputs alongside the 4 old regression outputs, so that I can apply GradCAM to it. Training actually works when I have no GradCAM callback, but with the callback it fails after the first epoch with the error: Graph disconnected: cannot obtain value for tensor Tensor("input_1:0", shape=(None, 60, 108, 3), dtype=float32) at layer "input_1". The following previous layers were accessed without issue: [].
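
One way I could narrow this down outside the callback (a sketch only, based on how a Grad-CAM sub-model is usually built, not the exact tf-explain internals) is to loop over the layers and try to rebuild an input -> layer output -> model outputs sub-model for each of them. If the only layers that fail are the ones whose recorded output reaches back into the embedded model_1 ResNet, the disconnection would come from that nested model keeping its own internal "input_1", separate from the outer model's input:

# Hedged diagnostic, reusing the names from my code above. For each layer of
# new_model, try to build a sub-model from the model inputs to that layer's
# output plus the regular outputs, and report which layers break the graph.
for layer in reversed(new_model.layers):
    try:
        tf.keras.Model(new_model.inputs, [layer.output] + new_model.outputs)
        print(layer.name, '-> OK')
    except Exception as err:
        print(layer.name, '->', type(err).__name__, str(err)[:100])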