GradCam Issues with MobileNetV3

dlin511 commented 3 years ago

Hi @keisen, first off, thank you all so much for the amazing library.

I was recently using GradCam++ with MobileNetV3 and found that the heatmaps created were entirely blue despite the predictions being accurate. I am currently reading through the relevant literature to figure out whether it is a feature of the MobileNetV3 architecture or some lack of compatibility with the visualization code. I was just wondering if you guys would have any insight to perhaps resolve the issue more quickly.

Here is some code to recreate the issue (adapted mostly from your tutorial):

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tf_keras_vis.utils.scores import CategoricalScore
from matplotlib import cm
from tf_keras_vis.gradcam import GradcamPlusPlus
from tensorflow.keras.preprocessing.image import load_img

mobilenet = tf.keras.applications.MobileNetV3Large(
    input_shape=(224,224,3), include_top=True,
    weights='imagenet', classes=1000)

# Random goldfish or bear images
img1 = load_img("goldfish.jpeg", target_size=(224,224))
img2 = load_img("bear.jpeg", target_size=(224,224))

# Cast to tf.float32 to watch gradients
# No additional preprocessing is necessary for mobilenetv3 as the architecture takes care of it
# Source: https://www.tensorflow.org/api_docs/python/tf/keras/applications/mobilenet_v3/preprocess_input
images = tf.cast(np.asarray([np.array(img1), np.array(img2)]), tf.float32)

def model_modifier(cloned_model):
    cloned_model.layers[-1].activation = tf.keras.activations.linear
    return cloned_model

score = CategoricalScore([1, 294])

gradcam = GradcamPlusPlus(mobilenet,
                  model_modifier=model_modifier,
                  clone=False)

cam = gradcam(score, images, penultimate_layer=-1)

image_titles = ['Goldfish', 'Bear']
f, ax = plt.subplots(nrows=1, ncols=2, figsize=(12, 4))
for i, title in enumerate(image_titles):
    heatmap = np.uint8(cm.jet(cam[i])[..., :3] * 255)
    ax[i].set_title(title, fontsize=16)
    ax[i].imshow(images[i]/255)
    ax[i].imshow(heatmap, cmap='jet', alpha=0.5) # overlay
    ax[i].axis('off')
plt.tight_layout()
plt.show()

keisen commented 3 years ago

@dlin511, Thank you for letting us know!

There are 2 causes for the problem.

1. With MobileNetV3 and `penultimate_layer=-1`, the class activation map (CAM) that Gradcam++ visualizes has a spatial size of 1x1.

Gradcam++ (as well as Gradcam and Scorecam) resizes the CAM to the input image size, as follows.

https://github.com/keisen/tf-keras-vis/blob/4b7779532e425acf843cfc7e5ceea40a8dc3bcca/tf_keras_vis/gradcam.py#L259-L261

So when the CAM is 1x1, the whole image generated by Gradcam++ is a single color. To avoid this, set the penultimate_layer option to 'Conv_2' or the name of an earlier layer.

cam = gradcam(..., penultimate_layer='Conv_2')

2. The sign of the CAM is reversed. (This is what causes the blue.)

I don't know why this happens. For now, we can get the CAM we expect by reversing the sign of the CAM as below (here `K` is `tensorflow.keras.backend`):

cam = gradcam(..., activation_modifier=lambda cam: K.relu(-cam))
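To illustrate cause 1, here is a minimal NumPy sketch of why a 1x1 CAM always ends up as a single color after resizing. It uses a hypothetical nearest-neighbour helper, `upsample_nearest`, which is an assumption for illustration only; the library's own resize code (linked above) may interpolate differently, but any interpolation of a single value gives a constant image.

```python
import numpy as np

def upsample_nearest(cam, out_h, out_w):
    """Nearest-neighbour upsampling of a 2-D map to (out_h, out_w).

    Hypothetical helper for illustration; not the library's resize code.
    """
    in_h, in_w = cam.shape
    rows = np.arange(out_h) * in_h // out_h  # source row for each output row
    cols = np.arange(out_w) * in_w // out_w  # source column for each output column
    return cam[np.ix_(rows, cols)]

# A 1x1 CAM, as produced when the penultimate layer has no spatial extent.
cam_1x1 = np.array([[0.37]])
# A 7x7 CAM, as produced by a convolutional layer that still has spatial extent.
cam_7x7 = np.random.default_rng(0).random((7, 7))

flat = upsample_nearest(cam_1x1, 224, 224)
spatial = upsample_nearest(cam_7x7, 224, 224)

# The upsampled 1x1 map holds exactly one distinct value: one color everywhere.
print(flat.shape, np.unique(flat).size)
# The upsampled 7x7 map keeps its spatial variation.
print(spatial.shape, np.unique(spatial).size)
```

Whatever colormap is applied afterwards, the first map can only produce a uniform overlay, which is why an earlier layer with a larger spatial size is needed.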

Putting both together, you can avoid the problem by changing the Gradcam++ call as below:

cam = gradcam(score, images, penultimate_layer='Conv_2', activation_modifier=lambda cam: K.relu(-cam))
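To illustrate cause 2, the following NumPy sketch shows what `relu(-cam)` does compared with a plain ReLU. It assumes the default activation modifier is a plain ReLU; the `raw` values are made up for illustration and do not come from a real model.

```python
import numpy as np

def relu(x):
    """Element-wise ReLU, standing in for K.relu in this sketch."""
    return np.maximum(x, 0.0)

# Made-up raw CAM values; the region of interest is assumed to be negative.
raw = np.array([[-2.0, -0.5],
                [0.0, 1.5]])

plain = relu(raw)      # keeps only the positive entries
flipped = relu(-raw)   # keeps only the (sign-flipped) negative entries

print(plain)
print(flipped)
```

With a plain ReLU, the strongly negative entries are clipped to zero and land at the low (blue) end of the jet colormap. Flipping the sign first makes them the largest values instead.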

Thanks!

dlin511 commented 3 years ago

Thank you so much!! I appreciate your quick response, you're amazing.

keisen commented 3 years ago

You're welcome. Please star this repository if you'd like!

srwi commented 2 years ago

@dlin511 Using Conv_2 as the penultimate layer can cause inverted gradcam activations. Instead, multiply_18 should be used, which is the last layer after Conv_2 that still has the same output shape as Conv_2. This applies to TF 2.4. The MobileNetV3 implementation has changed slightly in later versions.
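Since layer names differ between TF versions, one way to find such a layer is to scan forward from a reference layer until the output shape changes. The sketch below does this on a plain list of `(name, shape)` pairs. The helper `last_layer_with_same_shape` and the layer names and shapes in `toy_layers` are hypothetical, not a dump of the real MobileNetV3 graph.

```python
def last_layer_with_same_shape(layers, reference_name):
    """Return the name of the last layer in the unbroken run of layers
    that follows `reference_name` and shares its output shape.

    `layers` is a list of (name, output_shape) pairs in model order.
    """
    names = [name for name, _ in layers]
    start = names.index(reference_name)  # raises ValueError if the name is absent
    ref_shape = layers[start][1]
    result = reference_name
    for name, shape in layers[start + 1:]:
        if shape != ref_shape:
            break  # the shape changed, so the run has ended
        result = name
    return result

# Hypothetical layer list for illustration only.
toy_layers = [
    ("conv_a", (None, 7, 7, 960)),
    ("conv_b", (None, 7, 7, 1280)),
    ("act_add", (None, 7, 7, 1280)),
    ("act_mul", (None, 7, 7, 1280)),
    ("pool", (None, 1280)),
    ("logits", (None, 1000)),
]

candidate = last_layer_with_same_shape(toy_layers, "conv_b")
print(candidate)
```

For a real Keras model, the list could probably be built with something like `[(l.name, tuple(l.output.shape)) for l in model.layers]`, assuming every layer has a single output. The resulting name can then be passed as `penultimate_layer`. Stopping at the first shape change is a deliberate choice here, so that a later layer that merely happens to share the shape is not picked up.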
