keisen / tf-keras-vis

Neural network visualization toolkit for tf.keras
https://keisen.github.io/tf-keras-vis-docs/
MIT License

Graph disconnected: custom model with two inputs #48

Open SergioG-M opened 3 years ago

SergioG-M commented 3 years ago

Hi, I'm getting an error when I try to get the Grad-CAM visualizations for a custom model with two inputs. A minimal working example follows:

from tensorflow.keras import Model, layers, optimizers
from tensorflow.keras.applications import EfficientNetB0

img_width = 224
img_height = 224
num_classes = 2  # placeholder value; the original code read this from self.num_classes

# Shared backbone, applied to both inputs
model = EfficientNetB0(include_top=False, input_shape=(img_width, img_height, 3),
                       weights='imagenet', drop_connect_rate=0.2)

input1 = layers.Input(shape=(img_width, img_height, 3))
input2 = layers.Input(shape=(img_width, img_height, 3))

output1 = model(input1)
output2 = model(input2)

conc = layers.Concatenate(axis=-1, name='Concat_features')([output1, output2])
# Rebuild last layer
x = layers.GlobalAveragePooling2D(name="avg_pool")(conc)
x = layers.BatchNormalization()(x)

top_dropout_rate = 0.2
x = layers.Dropout(top_dropout_rate, name="top_dropout")(x)
outputs = layers.Dense(num_classes, activation="softmax", name="predictions")(x)

dual_model = Model([input1, input2], outputs)
optimizer = optimizers.Adam(learning_rate=1e-4)

dual_model.compile(optimizer=optimizer, loss="categorical_crossentropy",
                   metrics=["accuracy"])

Then, when I try to call GradCAM, I get:

ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_1:0", shape=(None, 224, 224, 3), dtype=float32) at layer "rescaling". The following previous layers were accessed without issue: []

Any idea how I can solve this?

keisen commented 3 years ago

Hi @SergioG-M, sorry for the late reply.

The cause of the problem is that your model is nested: it contains the EfficientNet model as a layer. When a tf.keras.Model is used as a layer, the TensorFlow graph is disconnected between the top-level model and the nested model. However, GradCAM (and the other visualization methods) need a graph that is connected continuously from the input layer to the output layer of the model.
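To make the "disconnected graph" idea concrete, here is a toy sketch in plain Python. It is not Keras's implementation: the `Node` class, the `check_connected` helper and the node names are all hypothetical, chosen only to mimic the backward walk from outputs to inputs that produces this kind of error.

```python
class Node:
    """A tensor in a toy graph; `parents` are the tensors it was computed from."""

    def __init__(self, name, parents=()):
        self.name = name
        self.parents = tuple(parents)


def check_connected(outputs, inputs):
    """Walk back from `outputs`; fail if we hit a source tensor not in `inputs`."""
    allowed = {id(n) for n in inputs}
    seen = set()
    stack = list(outputs)
    while stack:
        node = stack.pop()
        if id(node) in allowed or id(node) in seen:
            continue
        seen.add(id(node))
        if not node.parents:
            raise ValueError(
                f"Graph disconnected: cannot obtain value for tensor {node.name!r}")
        stack.extend(node.parents)
    return True


# Inner (backbone) graph, built on its own input.
inner_in = Node("input_1")
top_conv = Node("top_conv", [inner_in])

# Outer graph: the whole backbone shows up as ONE opaque node.
outer_in = Node("input_2")
backbone_call = Node("efficientnetb0", [outer_in])
predictions = Node("predictions", [backbone_call])

# Outer inputs -> outer outputs is fine.
check_connected([predictions], [outer_in])

# Asking for a layer inside the backbone leads back to the inner input instead.
try:
    check_connected([top_conv, predictions], [outer_in])
except ValueError as err:
    message = str(err)
```

In this toy, the second check fails because `top_conv` traces back to `input_1`, which is not one of the outer model's inputs. That is the same shape of failure as the reported error.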

So you need to build the model slightly differently. Please change the code below ...

model = EfficientNetB0(include_top=False, input_shape=(img_width, img_height, 3),
                   weights='imagenet', drop_connect_rate=0.2)

input1 = layers.Input(shape=(img_width, img_height, 3))
input2 = layers.Input(shape=(img_width, img_height, 3))

output1 = model(input1)
output2 = model(input2)

... to the following:

input1 = layers.Input(shape=(img_width, img_height, 3))
input2 = layers.Input(shape=(img_width, img_height, 3))

model1 = EfficientNetB0(include_top=False, input_tensor=input1,
                        weights='imagenet', drop_connect_rate=0.2)
model2 = EfficientNetB0(include_top=False, input_tensor=input2,
                        weights='imagenet', drop_connect_rate=0.2)

output1 = model1.output
output2 = model2.output

Then please try GradCAM again!
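As a rough illustration of why the flat layout helps, the sketch below builds a two-input model whose branches are created directly on the input tensors, then builds a second model that also exposes an inner convolution output. That second model is only an assumption about the kind of sub-model a CAM method needs, not tf-keras-vis's actual code. `tiny_backbone`, the layer names and the shapes are hypothetical stand-ins for EfficientNet, chosen to keep the example small.

```python
import tensorflow as tf

layers = tf.keras.layers


def tiny_backbone(input_tensor, prefix):
    """Hypothetical stand-in for a backbone built directly on `input_tensor`."""
    x = layers.Conv2D(8, 3, padding="same", activation="relu",
                      name=f"{prefix}_conv1")(input_tensor)
    x = layers.Conv2D(16, 3, padding="same", activation="relu",
                      name=f"{prefix}_conv2")(x)
    return x


input1 = layers.Input(shape=(32, 32, 3), name="image_a")
input2 = layers.Input(shape=(32, 32, 3), name="image_b")

# Each branch gets its own name prefix so all layer names stay unique.
features = layers.Concatenate(axis=-1)(
    [tiny_backbone(input1, "a"), tiny_backbone(input2, "b")])
x = layers.GlobalAveragePooling2D()(features)
outputs = layers.Dense(3, activation="softmax", name="predictions")(x)
flat_model = tf.keras.Model([input1, input2], outputs)

# Same inputs, but also returning an inner conv output. This works because
# every layer is connected to the top-level inputs.
cam_model = tf.keras.Model(
    flat_model.inputs,
    [flat_model.get_layer("a_conv2").output, flat_model.output])

batch = [tf.zeros((1, 32, 32, 3)), tf.zeros((1, 32, 32, 3))]
conv_maps, preds = cam_model(batch)
```

Two caveats apply when using two real EfficientNetB0 instances this way. The branches no longer share weights. The fixed layer names inside EfficientNet will also likely collide, since a functional model requires unique layer names, so one branch may need its layers renamed.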

Thanks!

nadezhda95 commented 10 months ago

Hi! I also used transfer learning, and the suggestion in the previous comment helped me solve the issue. However, I noticed that I get the same disconnection error when I use data augmentation layers:

import tensorflow as tf


def build_model():

    input_shape = (300, 150, 3)
    inputs = tf.keras.layers.Input(shape=input_shape)
    inputs = tf.keras.layers.RandomRotation(0.15)(inputs)
    inputs = tf.keras.layers.RandomFlip()(inputs)
    inputs = tf.keras.layers.RandomZoom(0.15)(inputs)

    global_average_layer = tf.keras.layers.GlobalAveragePooling2D()
    prediction_layer = tf.keras.layers.Dense(1, activation='sigmoid')
    base_model = tf.keras.applications.EfficientNetV2S(input_shape=input_shape,
                                                        include_top=False,
                                                        weights='imagenet',
                                                        input_tensor=inputs)
    base_model.trainable = False

    bm_output = base_model.output

    x = tf.keras.layers.Dense(units=512, activation='relu', use_bias=False)(bm_output)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Dropout(0.5)(x)

    x = global_average_layer(x)

    outputs = prediction_layer(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(),
        loss=tf.keras.losses.BinaryCrossentropy(),
        metrics=['accuracy']
    )

    return model

augmentation_model = tf.keras.Sequential(
    [
        tf.keras.layers.RandomRotation(factor=0.15),
        tf.keras.layers.RandomFlip(),
        tf.keras.layers.RandomZoom(0.15)
    ]
)

The function builds the model, but LayerCAM doesn't work when I apply the data augmentation layers.
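One likely cause, offered as an assumption rather than a confirmed diagnosis: in `build_model` the name `inputs` is overwritten by the augmentation layers, so the final `Model(inputs, outputs)` call receives the output of `RandomZoom` instead of the original `Input` tensor. The sketch below keeps the `Input` tensor in its own variable and passes the augmented tensor to the backbone. The `weights` parameter and its default of `None` are additions here to avoid a download; pass `'imagenet'` to match the original.

```python
import tensorflow as tf


def build_model(input_shape=(300, 150, 3), weights=None):
    # Keep a handle on the real Input tensor; never overwrite it.
    inputs = tf.keras.layers.Input(shape=input_shape)

    # Augmentation operates on a separate variable.
    x = tf.keras.layers.RandomRotation(0.15)(inputs)
    x = tf.keras.layers.RandomFlip()(x)
    x = tf.keras.layers.RandomZoom(0.15)(x)

    # Build the backbone directly on the augmented tensor (flat graph).
    base_model = tf.keras.applications.EfficientNetV2S(
        include_top=False, weights=weights, input_tensor=x)
    base_model.trainable = False

    x = tf.keras.layers.Dense(512, activation='relu', use_bias=False)(base_model.output)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Dropout(0.5)(x)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(1, activation='sigmoid')(x)

    # The model now starts at the original Input and includes the augmentation layers.
    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(),
        loss=tf.keras.losses.BinaryCrossentropy(),
        metrics=['accuracy'],
    )
    return model


model = build_model()
layer_types = [type(layer).__name__ for layer in model.layers]
```

The random augmentation layers are only active in training mode, so a visualization computed at inference time sees the un-augmented image.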