keisen / tf-keras-vis

Neural network visualization toolkit for tf.keras
https://keisen.github.io/tf-keras-vis-docs/
MIT License

GradCAM does not detect correct convolutional layer in multi-input case #96

Open marieff587 opened 1 year ago

marieff587 commented 1 year ago

Thank you for this useful visualization package!

Right now I have a two-input, one-output model, as shown in the following image: (image)

I am using GradCAM with the code snippet below. I have set the penultimate layer to 'ria-conv', which comes after the concatenation of both networks.

(image: code snippet)

However, the resulting cam has a dimension of 2, one entry for each network (15 * 256 * 320 is my image count * image dimensions), as shown in this screenshot: (image)

As far as I understand, since the actual last convolutional layer comes after the concatenation, shouldn't cam be 1 * 15 * 256 * 320? This happens both when I explicitly name the penultimate layer and when I set it to -1.

keisen commented 1 year ago

Hi, @marieff587. Thank you for the good question.

I think the two inputs of your model happen to be the same size. Now imagine that they were different sizes. By design, Gradcam returns cam images that are the same size as the input images. So, if the model has two or more inputs, Gradcam returns multiple cam images, one corresponding to each input.
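To make the reasoning concrete, here is a minimal pure-Python sketch of the idea. It is not tf-keras-vis code, and the helper names `resize_nearest` and `cams_for_inputs`, the 2x2 map values, and the second input size are all made up for illustration. It assumes a single coarse map has already been computed at one layer, and it uses nearest-neighbour resizing only for simplicity. The batch dimension (15 in the question) is left out.

```python
# Illustrative sketch only: this is NOT the tf-keras-vis implementation.
# resize_nearest and cams_for_inputs are hypothetical helper names.

def resize_nearest(cam, out_h, out_w):
    """Upsample a 2-D map (list of rows) to out_h x out_w by nearest neighbour."""
    in_h, in_w = len(cam), len(cam[0])
    return [
        [cam[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def cams_for_inputs(cam, input_sizes):
    """Return one resized copy of the same coarse map per model input."""
    return [resize_nearest(cam, h, w) for (h, w) in input_sizes]


# One coarse 2x2 map, as if computed at a single layer after concatenation.
coarse_cam = [[0.0, 1.0],
              [0.5, 0.25]]

# Two inputs of the same size (256 x 320, as in the question).
same = cams_for_inputs(coarse_cam, [(256, 320), (256, 320)])
print(len(same), len(same[0]), len(same[0][0]))  # 2 256 320

# Two inputs of different (made-up) sizes: one output could not match both.
diff = cams_for_inputs(coarse_cam, [(256, 320), (128, 64)])
print(len(diff[1]), len(diff[1][0]))  # 128 64
```

In this sketch the two results are identical when the inputs share a size, which is why the output can look redundant; once the sizes differ, a single array could no longer match both inputs.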

Thanks!