albermax / innvestigate

A toolbox to iNNvestigate neural networks' predictions!

AvgPooling to Conv3D conversion (Canonization) #213

Closed: rachtibat closed this 3 years ago

rachtibat commented 4 years ago

[attached GIF: conv]

Hi (:

The AvgPooling2D layer can be represented by a Conv2D layer. Say the input is an RGB image with dimensions 28x28x3. The kernel of the Conv2D layer is then, for example, a 2x2 filter as in the attached GIF. What happens is that Keras, when the Conv2D layer is defined, turns this kernel into a 2x2x3x3 filter, since we have 3 input and 3 output channels. The filter slides two-dimensionally over the image, but sums up all three channels at the same time.

To make this behave like an AvgPool2D layer, a Conv2D with kernel size (2x2), 3 output channels and stride = (2,2) has to be defined:

So there are three filters in total; for each output channel/filter we define:

Kernel1[:,:,0,0] = 1/(2*2), Kernel1[:,:,1,1] = 0, Kernel1[:,:,2,2] = 0 -> averages only the first input channel

Kernel2[:,:,0,0] = 0, Kernel2[:,:,1,1] = 1/(2*2), Kernel2[:,:,2,2] = 0 -> averages only the second input channel

Kernel3[:,:,0,0] = 0, Kernel3[:,:,1,1] = 0, Kernel3[:,:,2,2] = 1/(2*2) -> averages only the third input channel

As a result, we get a 14x14x3 output from the 28x28x3 image. I have already implemented this and it works experimentally.
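
A minimal sketch of this construction (assuming TensorFlow 2 / Keras; the variable names are illustrative, not from the iNNvestigate code base):

import numpy as np
import tensorflow as tf
from tensorflow import keras

channels = 3

# build the full (2, 2, in_channels, out_channels) kernel with the averaging
# weights only on the channel diagonal
kernel = np.zeros((2, 2, channels, channels), dtype=np.float32)
for c in range(channels):
    kernel[:, :, c, c] = 1 / (2 * 2)  # average only input channel c

conv = keras.layers.Conv2D(filters=channels, kernel_size=(2, 2),
                           strides=(2, 2), use_bias=False)
conv.build((None, 28, 28, channels))
conv.set_weights([kernel])

x = tf.random.uniform((1, 28, 28, channels))
pool = keras.layers.AveragePooling2D(pool_size=(2, 2), strides=(2, 2))
print(np.allclose(conv(x).numpy(), pool(x).numpy(), atol=1e-6))  # True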

LRP can simply treat this layer as a normal kernel layer, and so far that works. The problem only arises with the flat rule: there, all kernel weights are set to 1! That is, Kernel1[:,:,0,0] = 1, Kernel1[:,:,1,1] = 1, Kernel1[:,:,2,2] = 1, which distorts the original operation, because relevance of the first output channel is then also distributed to the second and third input channels!
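
To illustrate the leakage, here is a small hypothetical check that uses gradients as a proxy for how relevance is routed back: with the diagonal kernel only input channel 0 contributes to output channel 0, while with the all-ones "flat" kernel all three input channels do.

import numpy as np
import tensorflow as tf

def grad_per_input_channel(kernel):
    """Sum |d out[..., 0] / d x| per input channel for the given 2x2 kernel."""
    x = tf.random.uniform((1, 4, 4, 3))
    with tf.GradientTape() as tape:
        tape.watch(x)
        out0 = tf.nn.conv2d(x, kernel, strides=2, padding="VALID")[..., 0]
    g = tape.gradient(out0, x)
    return tf.reduce_sum(tf.abs(g), axis=[0, 1, 2]).numpy()

diag = np.zeros((2, 2, 3, 3), dtype=np.float32)
for c in range(3):
    diag[:, :, c, c] = 1 / (2 * 2)              # the averaging kernel from above
flat = np.ones((2, 2, 3, 3), dtype=np.float32)  # what the flat rule uses

print(grad_per_input_channel(diag))  # non-zero only for input channel 0
print(grad_per_input_channel(flat))  # non-zero for all three input channels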

A Conv3D layer avoids this problem: define an input shape of (28,28,3,1) for the given input image and a filter of size (2,2,1), which results in weights of shape (2,2,1,1). The output is then 14x14x3x1. Just try it out in a Jupyter notebook (;
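
A quick sketch of the Conv3D variant (hypothetical names; the channel axis of the image is treated as a depth axis with a single channel):

import numpy as np
import tensorflow as tf
from tensorflow import keras

x = tf.random.uniform((1, 28, 28, 3))
x3d = tf.reshape(x, (1, 28, 28, 3, 1))  # channels become a depth axis

conv3d = keras.layers.Conv3D(filters=1, kernel_size=(2, 2, 1),
                             strides=(2, 2, 1), use_bias=False)
conv3d.build((None, 28, 28, 3, 1))
conv3d.set_weights([np.full((2, 2, 1, 1, 1), 1 / (2 * 2), dtype=np.float32)])

out = conv3d(x3d)  # shape (1, 14, 14, 3, 1)
pool = keras.layers.AveragePooling2D(pool_size=(2, 2), strides=(2, 2))(x)
print(np.allclose(out.numpy()[..., 0], pool.numpy(), atol=1e-6))  # True

Because the kernel spans only a single step along the channel/depth axis, setting all of its weights to 1 (as the flat rule does) can no longer mix relevance across the original channels.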

Best regards

albermax commented 4 years ago

Hi Reduan,

Sorry for the late reply. It would be great to keep the discussions in English. :-)

I think the main confusion in our discussion came from the fact that I thought we were talking about plain (global) average pooling layers.

To recap, the issue is basically that the flat rule sets all values of the Conv2D kernel to 1, which does not reflect the actual operation.

Would the problem not be solved if the flat rule reused the kernel from the forward pass (plus maybe adjusted the scaling), which would not "leak relevance" to the other channels?

Cheers, Max

rachtibat commented 3 years ago

Hi Max,

Yes, we could change the rules instead of implementing a canonization module, and in fact I think it is just a matter of taste. But I believe it is more "elegant" to implement a canonization module.

If you look for instance at the z-rule for the AvgPool2D:


class AveragePoolingReverseRule(reverse_map.ReplacementLayer):
    """Special AveragePooling handler that applies the Z-Rule"""

    def __init__(self, layer, *args, **kwargs):
        self._layer_wo_act = kgraph.copy_layer_wo_activation(layer,
                                                             name_template="no_act_%s")
        super(AveragePoolingReverseRule, self).__init__(layer, *args, **kwargs)

    def wrap_hook(self, ins, neuron_selection, stop_mapping_at_layers):
        with tf.GradientTape(persistent=True) as tape:
            tape.watch(ins)
            outs = self.layer_func(ins)
            Zs = self._layer_wo_act(ins)

            # check if final layer (i.e., no next layers)
            if len(self.layer_next) == 0 or self.name in stop_mapping_at_layers:
                outs = self._neuron_select(outs, neuron_selection)
                Zs = self._neuron_select(Zs, neuron_selection)

        return outs, Zs, tape

    def explain_hook(self, ins, reversed_outs, args):

        if len(self.input_shape) > 1:
            raise ValueError("This Layer should only have one input!")

        # the output of the pooling operation at each location is the sum of its inputs.
        # the forward messages must be known in this case; they are the inputs of each pooling region.
        # the gradient is 1 for each output-to-input connection, which corresponds to the "weights"
        # of the layer. It should thus be sufficient to reweight the relevances and do a gradient_wrt

        outs, Zs, tape = args
        # last layer
        if reversed_outs is None:
            reversed_outs = Zs

        # Divide incoming relevance by the activations.
        if len(self.layer_next) > 1:
            tmp = [ilayers.SafeDivide()([r, Zs]) for r in reversed_outs]
            # Propagate the relevance to input neurons
            # using the gradient.
            tmp2 = [tape.gradient(Zs, ins, output_gradients=t) for t in tmp]
            ret = keras_layers.Add()([keras_layers.Multiply()([ins, t]) for t in tmp2])
        else:
            tmp = ilayers.SafeDivide()([reversed_outs, Zs])
            # Propagate the relevance to input neurons
            # using the gradient.
            tmp2 = tape.gradient(Zs, ins, output_gradients=tmp)
            ret = keras_layers.Multiply()([ins, tmp2])

        return ret

It is a looot of code. And then we have to implement it for every single rule. Of course, there might be some "tricks" with which we could minimize the effort, but let me show you an alternative:

Instead, we could try to generalize LRP. LRP for one layer can be calculated with the gradient: R_input = x_input * dz/dx * (R/z). If we manage to find a suitable "dz/dx" for every layer, we could generalize this simple formula to all other rules.
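
A minimal sketch of this generic formula in TensorFlow (assuming a single-input layer given as a callable layer_fn; the names are illustrative):

import tensorflow as tf

def lrp_step(layer_fn, x, relevance, eps=1e-9):
    """Generic gradient-based LRP step: R_in = x * dz/dx * (R / z)."""
    with tf.GradientTape() as tape:
        tape.watch(x)
        z = layer_fn(x)                          # (rule-modified) forward pass
    s = relevance / (z + eps)                    # R / z, stabilized
    c = tape.gradient(z, x, output_gradients=s)  # dz/dx contracted with R / z
    return x * c                                 # element-wise x * (dz/dx . R/z)

Choosing a different rule would then just mean choosing a different rule-modified forward function for "dz/dx".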

I found a much easier way to canonize the AvgPool2D layer: we define our own Keras layer whose kernel behaves as we wish:

import numpy as np
import tensorflow as tf
from tensorflow import keras


class LRP_AveragePooling2D(keras.layers.Layer):

    def __init__(self):
        super(LRP_AveragePooling2D, self).__init__()

        # define mutable kernel that will be expanded to a final "filter_kernel" in runtime
        self.np_kernel = np.ones(shape=(2, 2)) * 1 / (2 * 2)
        self.kernel = tf.Variable(self.np_kernel, dtype=tf.float32, trainable=False)

    def build(self, input_shape):

        # assume channel dimension is last
        self.filter_shape = (2, 2, input_shape[3], input_shape[3])

        # initialize final filter kernel as numpy array
        self.np_filter_kernel = np.zeros(self.filter_shape, dtype=np.float32)

    def call(self, input):
        #calculate final filter kernel here, so that the self.kernel can be changed in runtime

        # loop through channels and set kernel weights
        for c in range(self.filter_shape[-1]):
            self.np_filter_kernel[:, :, c, c] = self.kernel.numpy()

        filter_kernel = tf.constant(self.np_filter_kernel)

        return tf.nn.conv2d(input, filter_kernel, 2, padding="VALID", data_format='NHWC')

The forward and backward passes are correct. It is much simpler and easier for us to implement. And most importantly: it works for all rules!
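
For reference, a quick sanity check (hypothetical, reusing the imports and class from the snippet above and assuming eager execution, since call uses self.kernel.numpy()) that the custom layer reproduces the AveragePooling2D forward pass:

x = tf.random.uniform((1, 28, 28, 3))

lrp_pool = LRP_AveragePooling2D()
ref_pool = keras.layers.AveragePooling2D(pool_size=(2, 2), strides=(2, 2))

# both should produce identical (1, 14, 14, 3) outputs
print(np.allclose(lrp_pool(x).numpy(), ref_pool(x).numpy(), atol=1e-6))  # True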

Best wishes