Closed tohnperfect closed 4 years ago
@tohnperfect: I faced the same problem a few days ago. Try this; it worked very well in my case:
Change it to a functional model:
from tensorflow.keras.layers import GlobalAveragePooling2D, Dropout, Dense
from tensorflow.keras.models import Model
# pre_trained_model is the headless base network (in your case efficientnet-b3)
x = pre_trained_model.output
global_average_layer = GlobalAveragePooling2D()(x)
dropout_layer_1 = Dropout(0.50)(global_average_layer)
prediction_layer = Dense(1, activation='sigmoid')(dropout_layer_1)
model = Model(inputs=pre_trained_model.input, outputs=prediction_layer)
model.summary()
Now, train the model and use it for GradCAM. I hope it works for you too. Please do let us know if you find any other solution, i.e. a workaround for Sequential models.
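A minimal sketch of why the flat functional model works with Grad-CAM, assuming a tiny stand-in backbone: the `tiny_backbone` helper, the layer names, the input size, and the hand-written `grad_cam` function are all hypothetical and not from this thread (this is not tf-explain's code). Because the classifier is attached to `pre_trained_model.output`, every conv layer lives in the same graph as the model input, so a sub-model returning both a feature map and the prediction can be built directly.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers


def tiny_backbone():
    # Hypothetical stand-in for a headless pre-trained network (include_top=False).
    inp = tf.keras.Input(shape=(32, 32, 3))
    x = layers.Conv2D(8, 3, padding="same", activation="relu", name="conv_a")(inp)
    x = layers.Conv2D(16, 3, padding="same", activation="relu", name="conv_last")(x)
    return tf.keras.Model(inp, x, name="backbone")


pre_trained_model = tiny_backbone()

# Attach the head to the backbone's own output tensor: the result is ONE flat graph.
x = layers.GlobalAveragePooling2D()(pre_trained_model.output)
x = layers.Dropout(0.5)(x)
prediction = layers.Dense(1, activation="sigmoid")(x)
model = tf.keras.Model(inputs=pre_trained_model.input, outputs=prediction)


def grad_cam(model, images, layer_name):
    # Sub-model returning the chosen feature map together with the prediction.
    grad_model = tf.keras.Model(
        model.inputs, [model.get_layer(layer_name).output, model.output]
    )
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(images, training=False)
        score = preds[:, 0]
    grads = tape.gradient(score, conv_out)        # d(score) / d(feature map)
    weights = tf.reduce_mean(grads, axis=(1, 2))  # one weight per channel
    cam = tf.einsum("bhwc,bc->bhw", conv_out, weights)
    return tf.nn.relu(cam).numpy()


images = np.random.rand(2, 32, 32, 3).astype("float32")
heatmaps = grad_cam(model, images, "conv_last")
print(heatmaps.shape)
```

With a real pre-trained network, the layer name passed to `grad_cam` would be one of the names listed by `model.summary()`.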
Thanks a lot @rao208
This means creating a model in a functional way before training and train it again, right? I will try that and update here.
I still wonder if the sequential model can be used with TF-explain because I have several trained sequential models to be explored with GradCAM.
@tohnperfect
This means creating a model in a functional way before training and train it again, right?
Yes, this means creating the classifier part, i.e. the global average pooling layer and the dense layer (as written in my reply above), in a functional way (before training) and training your entire model again. I used VGG16 as my pre-trained model with include_top=False. It is okay when the pre-trained model is a sequential model.
I still wonder if the sequential model can be used with TF-explain because I have several trained sequential models to be explored with GradCAM.
Well, you can use the sequential models with GradCAM, but the problem here is you are using a pre-trained network without the classifier. When you try to connect this pre-trained network with your classifier, the model is viewed as two separate graphs. That is why you are getting this error.
When you build a sequential model from scratch, you won't get the 'graph disconnected' error. I have used many sequential models built from scratch and they work well with GradCAM (tf_explain).
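To illustrate the "two separate graphs" point, here is a small sketch with a hypothetical stand-in backbone (names and shapes are assumptions, not from this thread). It only inspects the structure: when a headless network is wrapped in a Sequential model, the outer model sees it as one nested layer, and the conv layers are reachable only through that nested model.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Hypothetical stand-in for a headless pre-trained network.
inp = tf.keras.Input(shape=(32, 32, 3))
feat = layers.Conv2D(8, 3, padding="same", activation="relu", name="conv_last")(inp)
backbone = tf.keras.Model(inp, feat, name="backbone")

# Wrapping the backbone in a Sequential model keeps it as a single nested layer.
model = tf.keras.Sequential([
    backbone,
    layers.GlobalAveragePooling2D(name="gap"),
    layers.Dense(1, activation="sigmoid", name="head"),
])

outer_names = [layer.name for layer in model.layers]
print(outer_names)

# The conv layer is not a direct layer of the outer model ...
try:
    model.get_layer("conv_last")
    found_in_outer = True
except ValueError:
    found_in_outer = False

# ... it is only reachable through the nested backbone, and its output tensor
# belongs to the backbone's own graph (rooted at the backbone's own Input).
inner_layer = model.get_layer("backbone").get_layer("conv_last")
print(found_in_outer, inner_layer.name)
```

A Grad-CAM helper that asks the outer model for an inner layer's output therefore mixes tensors from the inner graph with the outer model's input, which is consistent with the error reported in this issue.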
Honestly, I don't think it will make any difference. The way I understand it, the functional model is used for more complex architectures, for example when you have skip connections as in ResNet, while Sequential models are used for simpler architectures where layers are stacked one after another. If there is any major difference, I am not aware of it.
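As a small illustration of that point, the sketch below builds the same plain layer stack once with the Sequential API and once with the functional API (the layer sizes are arbitrary assumptions) and compares the parameter counts.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_sequential():
    # Layers listed one after another.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(32, 32, 3)),
        layers.Conv2D(8, 3, activation="relu"),
        layers.GlobalAveragePooling2D(),
        layers.Dense(1, activation="sigmoid"),
    ])


def build_functional():
    # The same stack, wired explicitly tensor by tensor.
    inputs = tf.keras.Input(shape=(32, 32, 3))
    x = layers.Conv2D(8, 3, activation="relu")(inputs)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)
    return tf.keras.Model(inputs, outputs)


seq_model = build_sequential()
fn_model = build_functional()
print(seq_model.count_params(), fn_model.count_params())
```

For a simple stack like this the two APIs describe the same network; the functional API only becomes necessary once the graph branches or merges.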
I got it. Thanks! @rao208
Thanks @rao208 @tohnperfect
However, I cannot figure out how to add additional input layers (e.g. aug, pre) in front of the model in a way that works with GradCAM().
Here is an example, not fully working, just for reference.
# MobileNetV2
num_classes = 5
inputs = tf.keras.Input(shape=(224, 224, 3))
aug = data_augmentation(inputs)
pre = preprocess_input(aug)
bm_output = base_model(pre, training=False)
gap2d = tf.keras.layers.GlobalAveragePooling2D()(bm_output)
dro = tf.keras.layers.Dropout(0.2)(gap2d)
outputs = tf.keras.layers.Dense(num_classes)(dro)
model = tf.keras.Model(inputs, outputs, name='model-re-mbnetv2')
This new model adds a few layers in front of base_model. It trains and runs inference well, but applying GradCAM() fails with:
ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_97_2:0", shape=(None, 224, 224, 3), dtype=float32) at layer "input_97". The following previous layers were accessed without issue: []
Model: "model-re-mbnetv2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_98 (InputLayer)        [(None, 224, 224, 3)]     0
_________________________________________________________________
sequential_2 (Sequential)    (None, 224, 224, 3)       0
_________________________________________________________________
tf_op_layer_RealDiv_17 (Tens (None, 224, 224, 3)       0
_________________________________________________________________
tf_op_layer_Sub_17 (TensorFl (None, 224, 224, 3)       0
_________________________________________________________________
mobilenetv2_1.00_224 (Model) (None, 7, 7, 1280)        2257984
_________________________________________________________________
global_average_pooling2d_39  (None, 1280)              0
_________________________________________________________________
dropout_15 (Dropout)         (None, 1280)              0
_________________________________________________________________
dense_48 (Dense)             (None, 5)                 6405
=================================================================
Total params: 2,264,389
Trainable params: 6,405
Non-trainable params: 2,257,984
_________________________________________________________________
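One possible way around the nesting, sketched under assumptions and not confirmed by anyone in this thread: compute Grad-CAM by hand, building the feature sub-model entirely inside the nested base model and applying the outer preprocessing and head layers manually. The tiny backbone, the layer names (`pre`, `backbone`, `conv_last`, `gap`, `head`) and the `nested_grad_cam` helper are hypothetical; `Rescaling` stands in for the preprocessing step and assumes a TensorFlow version that provides `tf.keras.layers.Rescaling`.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Hypothetical stand-in for the nested base model (e.g. a headless MobileNetV2).
b_in = tf.keras.Input(shape=(32, 32, 3))
b_x = layers.Conv2D(8, 3, padding="same", activation="relu", name="conv_last")(b_in)
b_out = layers.MaxPooling2D(name="pool")(b_x)
base_model = tf.keras.Model(b_in, b_out, name="backbone")

# Outer model with an extra preprocessing layer in front of the base model.
inputs = tf.keras.Input(shape=(32, 32, 3))
pre = layers.Rescaling(1.0 / 127.5, offset=-1.0, name="pre")(inputs)
bm_output = base_model(pre, training=False)
gap2d = layers.GlobalAveragePooling2D(name="gap")(bm_output)
outputs = layers.Dense(5, name="head")(gap2d)
model = tf.keras.Model(inputs, outputs)


def nested_grad_cam(model, images, class_index):
    inner = model.get_layer("backbone")
    # Built only from tensors of the backbone's own graph, so nothing is disconnected.
    feature_model = tf.keras.Model(
        inner.inputs, [inner.get_layer("conv_last").output, inner.outputs[0]]
    )
    with tf.GradientTape() as tape:
        x = model.get_layer("pre")(tf.convert_to_tensor(images))
        conv_out, feats = feature_model(x, training=False)
        logits = model.get_layer("head")(model.get_layer("gap")(feats))
        score = logits[:, class_index]
    grads = tape.gradient(score, conv_out)
    weights = tf.reduce_mean(grads, axis=(1, 2))  # one weight per channel
    cam = tf.einsum("bhwc,bc->bhw", conv_out, weights)
    return tf.nn.relu(cam).numpy()


images = (np.random.rand(2, 32, 32, 3) * 255).astype("float32")
heatmaps = nested_grad_cam(model, images, class_index=3)
print(heatmaps.shape)
```

The trade-off is that the outer layers have to be replayed by hand in the right order, so this only stays simple when the layers around the base model form a plain chain.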
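# ResNet50
num_classes = 5
# Note: bm_output is not used below; the head is attached directly to base_model.output.
bm_output = base_model(inputs, training=False)
gap2d = tf.keras.layers.GlobalAveragePooling2D()(base_model.output)
outputs = tf.keras.layers.Dense(num_classes)(gap2d)
# Using base_model.input and base_model.output keeps everything in one flat graph.
model = tf.keras.Model(base_model.input, outputs, name='model-re-resnet50')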
This ResNet50 model trains and runs inference fine, and GradCAM() works on it, because it does not include any additional input layers.
Model: "model-re-resnet50"
__________________________________________________________________________________________________
Layer (type)                    Output Shape          Param #     Connected to
==================================================================================================
input_85 (InputLayer)           [(None, 224, 224, 3)  0
__________________________________________________________________________________________________
conv1_pad (ZeroPadding2D)       (None, 230, 230, 3)   0           input_85[0][0]
__________________________________________________________________________________________________
conv1_conv (Conv2D)             (None, 112, 112, 64)  9472        conv1_pad[0][0]
__________________________________________________________________________________________________
conv1_bn (BatchNormalization)   (None, 112, 112, 64)  256         conv1_conv[0][0]
__________________________________________________________________________________________________
conv1_relu (Activation)         (None, 112, 112, 64)  0           conv1_bn[0][0]
__________________________________________________________________________________________________
pool1_pad (ZeroPadding2D)       (None, 114, 114, 64)  0           conv1_relu[0][0]
__________________________________________________________________________________________________
pool1_pool (MaxPooling2D)       (None, 56, 56, 64)    0           pool1_pad[0][0]
__________________________________________________________________________________________________
...
...
...
conv5_block3_3_conv (Conv2D)    (None, 7, 7, 2048)    1050624     conv5_block3_2_relu[0][0]
__________________________________________________________________________________________________
conv5_block3_3_bn (BatchNormali (None, 7, 7, 2048)    8192        conv5_block3_3_conv[0][0]
__________________________________________________________________________________________________
conv5_block3_add (Add)          (None, 7, 7, 2048)    0           conv5_block2_out[0][0]
                                                                  conv5_block3_3_bn[0][0]
__________________________________________________________________________________________________
conv5_block3_out (Activation)   (None, 7, 7, 2048)    0           conv5_block3_add[0][0]
__________________________________________________________________________________________________
global_average_pooling2d_26 (Gl (None, 2048)          0           conv5_block3_out[0][0]
__________________________________________________________________________________________________
dense_35 (Dense)                (None, 5)             10245       global_average_pooling2d_26[0][0]
==================================================================================================
Total params: 23,597,957
Trainable params: 10,245
Non-trainable params: 23,587,712
@vscv There are some techniques online to add the additional input layers to the pretrained model (example: https://stackoverflow.com/questions/59695637/i-am-trying-to-merge-2-pretrained-keras-model-but-failed or https://stackoverflow.com/questions/40755914/prepending-downsample-layer-to-resnet50-pretrained-model)
I tried the following, as suggested in the answer on Stack Overflow. If your end goal is to implement GradCAM on layers of the pretrained model, then you cannot do so. If you see the output in
you will see that one cannot extract the layers of pretrained models and hence GradCAM cannot be implemented on the new model.
If you find any solution, please post it here. I tried looking everywhere online, but could not figure out how to add an additional input layer and still extract the layers of the pretrained model.
Hope this answers your question :)
@rao208 Thanks for the hint. I revised the problem statement.
@rao208 @vscv did you find any solution?
@rao208 @vscv did you find any solution?
The problem has not been resolved. My current alternative is to put preprocessing and augmentation into the tf.data pipeline (Dataset.map) instead of adding them as layers in front of base_model.
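A rough sketch of that alternative, with assumed details: the `augment` and `preprocess` functions, the random data, and the batch size are placeholders, and the scaling to [-1, 1] mimics MobileNetV2-style preprocessing. The model then takes already-preprocessed images, so no extra layers sit in front of the base model.

```python
import tensorflow as tf


def augment(image, label):
    # Placeholder augmentation; swap in whatever the training recipe needs.
    image = tf.image.random_flip_left_right(image)
    return image, label


def preprocess(image, label):
    # Scale pixel values from [0, 255] to [-1, 1] (assumed MobileNetV2-style).
    image = tf.cast(image, tf.float32) / 127.5 - 1.0
    return image, label


# Placeholder data standing in for a real image dataset.
images = tf.random.uniform((8, 224, 224, 3), maxval=255.0)
labels = tf.zeros((8,), dtype=tf.int32)

train_ds = (
    tf.data.Dataset.from_tensor_slices((images, labels))
    .map(augment, num_parallel_calls=tf.data.AUTOTUNE)
    .map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(4)
    .prefetch(tf.data.AUTOTUNE)
)

batch_images, batch_labels = next(iter(train_ds))
print(batch_images.shape, batch_labels.shape)
```

At explanation time only `preprocess` (not `augment`) would be applied to the images before passing them to GradCAM, since the model no longer does this itself.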
I have a trained sequential model composed of a pre-trained headless EfficientNet and the final layers. The model.summary() looks as follows,
My efficientnet-b3 model looks like,
I tried to use core API GradCAM for the trained model as follows,
which output this error,
ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_1:0", shape=(None, 150, 150, 3), dtype=float32) at layer "input_1". The following previous layers were accessed without issue: []
Please note that the model prediction works fine.
Thank you for your help!