marcoancona / DeepExplain

A unified framework of perturbation and gradient-based attribution methods for Deep Neural Networks interpretability. DeepExplain also includes support for Shapley Values sampling. (ICLR 2018)
https://arxiv.org/abs/1711.06104
MIT License

LRP for text classification - DeepExplain context and attribution sum #6

Closed ghost closed 6 years ago

ghost commented 6 years ago

Hi all, and thank you very much, marcoancona, for providing your implementation for explaining NNs! It's very valuable. I am currently working on text classification and I would like to understand which words contributed to the decision of my classifier. As there is no NLP example in this project, I followed your pseudocode and guidelines and wrote the following code to classify quotations extracted from 4 UK newspapers into their original news sources. In this sample dataset there are only 500 quotes in total and 4 classes (newspapers). I uploaded the data as numpy arrays here. The data is already preprocessed, tokenized, transformed into vectors, and padded.

Given the code I share below, my question is: why do I get the "You might have forgot to (re)create your graph within the DeepExlain context" warning, even though I reconstruct the model in the DeepExplain context? As I am a relative beginner in Keras, I am also unsure whether the code inside the DeepExplain context corresponds to the pseudocode you provided. Lastly, I didn't understand how to obtain the attributions per word as you described: I am not sure what to sum, nor how to recover the original words (rather than the vectors) that the attributions correspond to. I appreciate any hint! Thanks a lot. PS: The model performs poorly, but it's just a toy example to get familiar with DeepExplain and Keras.

# data processing
import numpy as np
import keras
# classification
from keras import backend as K
from keras.models import Sequential, Model  # Model is needed below to build fModel
from keras.layers import Dense, Dropout, Embedding, Flatten
import tensorflow as tf

# get data (already split and preproccessed)
# https://drive.google.com/file/d/19Fil-e8x20n_bP9Art8H0spbuO53i1Yx/view?usp=sharing
X_train = np.loadtxt('quotes_X_train.txt', dtype=int);
X_test = np.loadtxt('quotes_X_test.txt', dtype=int);
y_train = np.loadtxt('quotes_y_train.txt', dtype=int);
y_test = np.loadtxt('quotes_y_test.txt', dtype=int);

# build MLP
model = Sequential();
model.add(Embedding(input_dim=4218+1, output_dim=32, input_length=100));
model.add(Flatten());
model.add(Dense(100, activation='relu'));
model.add(Dropout(0.5));
model.add(Dense(4, activation='softmax'));
model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy']);
print(model.summary());

# fit and predict
model.fit(X_train, y_train,
          batch_size=32,
          epochs=5,
          validation_data=(X_test, y_test),
          verbose=1,
          shuffle=True);
y_pred = model.predict(np.array(X_test));
y_test = np.array(y_test);

# try to explain the model
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tempfile, sys, os
sys.path.insert(0, os.path.abspath('..'))

# Import DeepExplain
from deepexplain.tensorflow import DeepExplain

current_session = K.get_session();

with DeepExplain(session=current_session) as de:  # <-- init DeepExplain context

    # Get input tensor
    input_tensor = model.layers[0].input;
    print("input_tensor --- {}".format(input_tensor));
    # Get embedding tensor
    embedding_tensor = model.layers[0].output;
    print("embedding_tensor --- {}".format(embedding_tensor));
    # Get tensor before the final activation
    pre_softmax_tensor = model.layers[-1].output;
    print("pre_softmax_tensor --- {} ".format(pre_softmax_tensor));
    # Create model until before softmax
    fModel = Model(inputs=input_tensor, outputs = model.layers[-1].output)

    # Evaluate the embedding tensor on the model input (in other words, perform the lookup)
    embedding_out = current_session.run(embedding_tensor, {input_tensor: X_test})

    xs = X_test
    ys = y_test

    # Run DeepExplain with the embedding as input
    print("\nCalling deep explain soon ....\n");
    print("pre_softmax_tensor * ys shape --- {}".format((pre_softmax_tensor * ys).shape));
    print("embedding_tensor shape --- {}".format(embedding_tensor.shape));
    print("embedding_out shape --- {}\n".format(embedding_out.shape));
    attributions = de.explain('elrp', pre_softmax_tensor * ys, embedding_tensor, embedding_out)
    print("attributions shape --- {}".format(attributions.shape));
marcoancona commented 6 years ago

Hello, the warning happens because you call de.explain() with input and output tensors of the original model instead of those of fModel. I would try (not tested) with:

attributions = de.explain('elrp', fModel.outputs[0] * ys, fModel.inputs[0], embedding_out)

However, you also need to change how fModel is defined, because its input layer has to be embedding_tensor. You might need to compile the model as well (call fModel.compile() with the same params as model). One more important thing, not related to the issue: you are taking the output of the softmax, not the pre-softmax output. This is because model.layers[-1] is the last layer, Dense(class_number, activation='softmax'), which includes the activation. If you want the pre-softmax output you need to split the last layer of the model into two, then pick the second-to-last:

model.add(Dense(class_number, activation='linear'));
model.add(Activation('softmax'));  # requires: from keras.layers import Activation
...
pre_softmax_tensor = model.layers[-2].output

Please let me know if this helps.

ghost commented 6 years ago

Thank you for your help! I changed the model like this:

model = Sequential();
....
#model.add(Dense(4, activation='softmax'));
model.add(Dense(4, activation='linear'));
model.add(Activation('softmax'));
model.compile(....);

and now I get the pre-softmax reference like this:

pre_softmax_tensor = model.layers[-2].output;

So far so good. When I create the new model with the embedding tensor, though, I get an error saying that the input I gave to this model is not of an appropriate type:

#fModel = Model(inputs=input_tensor, outputs = model.layers[-2].output);
fModel = Model(inputs=embedding_tensor, outputs = model.layers[-2].output);

TypeError: Input layers to a Model must be InputLayer objects. Received inputs: Tensor("embedding_6/Gather:0", shape=(?, 100, 32), dtype=float32). Input 0 (0-based) originates from layer type Embedding

and this is what the tensors look like:

input_tensor --- Tensor("embedding_6_input:0", shape=(?, 100), dtype=int32)
embedding_tensor --- Tensor("embedding_6/Gather:0", shape=(?, 100, 32), dtype=float32)
pre_softmax_tensor --- Tensor("dense_10/BiasAdd:0", shape=(?, 4), dtype=float32)

Regarding the explain() call, I changed the tensor according to your suggestions:

new_pre_softmax_tensor = fModel.outputs[0];  # not the same as fModel.layers[-2].output
new_input_tensor = fModel.layers[0].input;  # same tensor as fModel.inputs[0]
print("new_pre_softmax_tensor: {}".format(new_pre_softmax_tensor));
print("new_input_tensor: {}".format(new_input_tensor));

new_pre_softmax_tensor: Tensor("dense_10/BiasAdd:0", shape=(?, 4), dtype=float32)
new_input_tensor: Tensor("embedding_6_input:0", shape=(?, 100), dtype=int32)

and unfortunately when I use them, I get a ValueError:

#attributions = de.explain('elrp', pre_softmax_tensor * ys, embedding_tensor, embedding_out)
attributions = de.explain('elrp', new_pre_softmax_tensor * ys, new_input_tensor, embedding_out);

ValueError: Dimensions must be equal, but are 100 and 4 for 'mul_188' (op: 'Mul') with input shapes: [?,100], [96,4].

and these are the shapes that are responsible for the ValueError:

pre_softmax_tensor * ys shape --- (96, 4)
new_pre_softmax_tensor * ys shape --- (96, 4)
embedding_tensor shape --- (?, 100, 32)
embedding_out shape --- (96, 100, 32)
attributions shape --- (96, 100, 32)

marcoancona commented 6 years ago

Right, the input to a Keras model cannot be a TF Tensor. So what about defining fModel as follows:

fModel = Model(inputs= model.inputs, outputs = model.layers[-2].output);

and using

new_input_tensor = fModel.layers[0].output  # <-- note: the output of the first layer, i.e. the output of the embedding
attributions = de.explain('elrp', new_pre_softmax_tensor * ys, new_input_tensor, embedding_out);

In other words, define a model with the same input as the original model and with the pre-softmax layer as output, then call explain() using the output of the first layer (i.e. the embedded representation of the input) as the input tensor for DeepExplain.

If you try this change first, then we can look at the shape mismatch problem.

ghost commented 6 years ago

I just included the changes and I get a ValueError (None values not supported) when I call explain().

This is the current DeepExplain code:

with DeepExplain(session=current_session) as de:  # <-- init DeepExplain context

    # Get input tensor
    input_tensor = model.layers[0].input;
    print("input_tensor --- {}".format(input_tensor));
    # Get embedding tensor
    embedding_tensor = model.layers[0].output;
    print("embedding_tensor --- {}".format(embedding_tensor));
    # Get tensor before the final activation
    pre_softmax_tensor = model.layers[-2].output;
    print("pre_softmax_tensor --- {} ".format(pre_softmax_tensor));
    # Create model until before softmax
    fModel = Model(inputs= model.inputs, outputs = model.layers[-2].output);

    fModel.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy']);
    print(fModel.summary());

    new_pre_softmax_tensor = fModel.outputs[0];  # not the same as fModel.layers[-2].output
    new_input_tensor = fModel.layers[0].output  # <-- the output of the first layer, i.e. the embedding output

    # Evaluate the embedding tensor on the model input (in other words, perform the lookup)
    embedding_out = current_session.run(embedding_tensor, {input_tensor: X_test});

    xs = X_test;
    ys = y_test;

    # Run DeepExplain with the embedding as input
    attributions = de.explain('elrp', new_pre_softmax_tensor * ys, new_input_tensor, embedding_out);
    print("attributions shape --- {}".format(attributions.shape));

and that's the error (which looks similar to this):


ValueError                                Traceback (most recent call last)
<ipython-input-11-a809f4474611> in <module>()
     55     #attributions = de.explain('elrp', pre_softmax_tensor * ys, embedding_tensor, embedding_out)
     56     #attributions = de.explain('elrp', new_pre_softmax_tensor * ys, new_input_tensor, embedding_out);
---> 57     attributions = de.explain('elrp', new_pre_softmax_tensor * ys, new_input_tensor, embedding_out);
     58     print("attributions shape --- {}".format(attributions.shape));
     59 

~/projects/nn-models/deepexplain-copy/deepexplain/tensorflow/methods.py in explain(self, method, T, X, xs, **kwargs)
    455         _ENABLED_METHOD_CLASS = method_class
    456         method = _ENABLED_METHOD_CLASS(T, X, xs, self.session, self.keras_phase_placeholder, **kwargs)
--> 457         result = method.run()
    458         if issubclass(_ENABLED_METHOD_CLASS, GradientBasedMethod) and _GRAD_OVERRIDE_CHECKFLAG == 0:
    459             warnings.warn('DeepExplain detected you are trying to use an attribution method that requires '

~/projects/nn-models/deepexplain-copy/deepexplain/tensorflow/methods.py in run(self)
    122 
    123     def run(self):
--> 124         attributions = self.get_symbolic_attribution()
    125         results =  self.session_run(attributions, self.xs)
    126         return results[0] if not self.has_multiple_inputs else results

~/projects/nn-models/deepexplain-copy/deepexplain/tensorflow/methods.py in get_symbolic_attribution(self)
    244         return [g * x for g, x in zip(
    245             tf.gradients(self.T, self.X),
--> 246             self.X if self.has_multiple_inputs else [self.X])]
    247 
    248     @classmethod

~/projects/nn-models/deepexplain-copy/deepexplain/tensorflow/methods.py in <listcomp>(.0)
    242 
    243     def get_symbolic_attribution(self):
--> 244         return [g * x for g, x in zip(
    245             tf.gradients(self.T, self.X),
    246             self.X if self.has_multiple_inputs else [self.X])]

~/anaconda3/envs/nn-models-env/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py in r_binary_op_wrapper(y, x)
    907   def r_binary_op_wrapper(y, x):
    908     with ops.name_scope(None, op_name, [x, y]) as name:
--> 909       x = ops.convert_to_tensor(x, dtype=y.dtype.base_dtype, name="x")
    910       return func(x, y, name=name)
    911 

~/anaconda3/envs/nn-models-env/lib/python3.6/site-packages/tensorflow/python/framework/ops.py in convert_to_tensor(value, dtype, name, preferred_dtype)
    834       name=name,
    835       preferred_dtype=preferred_dtype,
--> 836       as_ref=False)
    837 
    838 

~/anaconda3/envs/nn-models-env/lib/python3.6/site-packages/tensorflow/python/framework/ops.py in internal_convert_to_tensor(value, dtype, name, as_ref, preferred_dtype, ctx)
    924 
    925     if ret is None:
--> 926       ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
    927 
    928     if ret is NotImplemented:

~/anaconda3/envs/nn-models-env/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py in _constant_tensor_conversion_function(v, dtype, name, as_ref)
    227                                          as_ref=False):
    228   _ = as_ref
--> 229   return constant(v, dtype=dtype, name=name)
    230 
    231 

~/anaconda3/envs/nn-models-env/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py in constant(value, dtype, shape, name, verify_shape)
    206   tensor_value.tensor.CopyFrom(
    207       tensor_util.make_tensor_proto(
--> 208           value, dtype=dtype, shape=shape, verify_shape=verify_shape))
    209   dtype_value = attr_value_pb2.AttrValue(type=tensor_value.tensor.dtype)
    210   const_tensor = g.create_op(

~/anaconda3/envs/nn-models-env/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py in make_tensor_proto(values, dtype, shape, verify_shape)
    369   else:
    370     if values is None:
--> 371       raise ValueError("None values not supported.")
    372     # if dtype is provided, forces numpy array to be the type
    373     # provided if possible.

ValueError: None values not supported.
marcoancona commented 6 years ago

Hello, I feel I need to try it myself, because I cannot immediately see the problem. Could you please send me a minimal working model together with some data to try it out?

ghost commented 6 years ago

Thanks in advance for the help! My first post contains all the information. Here is the link to the Python notebook I use. The code is more or less the same as above and contains the fixes you suggested. Here is the dataset.

marcoancona commented 6 years ago

I also had some difficulty recreating the graph correctly in the DeepExplain context. I will have to think about it; in the meantime I suggest using a single model, creating and training it within the DeepExplain context:

with DeepExplain(session=current_session) as de:  # <-- init DeepExplain context

    model = Sequential();
    model.add(Embedding(input_dim=4218+1, output_dim=32, input_length=100));
    model.add(Flatten());
    model.add(Dense(100, activation='relu'));
    model.add(Dropout(0.5));
    #model.add(Dense(4, activation='softmax'));
    model.add(Dense(4, activation='linear'));
    model.add(Activation('softmax'));
    model.compile(loss='categorical_crossentropy',
                      optimizer='adam',
                      metrics=['accuracy']);
    model.summary();

    model.fit(X_train, y_train,
          batch_size=32,
          epochs=5,
          validation_data=(X_test, y_test),
          verbose=1,
          shuffle=True);

    # predict on test data
    y_pred = model.predict(np.array(X_test));
    y_test = np.array(y_test);

    # Evaluate the embedding tensor on the model input (in other words, perform the lookup)
    embedding_tensor = model.layers[0].output
    input_tensor = model.inputs[0]
    embedding_out = current_session.run(embedding_tensor, {input_tensor: X_test});

    xs = X_test;
    ys = y_test;
    # Run DeepExplain with the embedding as input
    attributions = de.explain('elrp', model.layers[-2].output * ys, model.layers[1].input, embedding_out);
    print("attributions shape --- {}".format(attributions.shape));
pramitchoudhary commented 6 years ago

Hi guys, thanks for starting this thread. I have been looking into a similar problem, and the above solution was very helpful in resolving some of the confusion. Here is working code:

# Reference: https://github.com/keras-team/keras/blob/master/examples/imdb_lstm.py
from keras.preprocessing import sequence
from keras.models import Sequential, Model, load_model, model_from_yaml
from keras.layers import Dense, Embedding
from keras.layers import LSTM
from keras.layers import Dense, Dropout, Flatten, Activation
from keras import backend as K
from keras.datasets import imdb
import numpy as np

max_features = 20000  # keep only the top max_features most common words
maxlen = 80  # cut texts after this number of words
batch_size = 32

print('Loading data...')
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
print(len(x_train), 'train sequences')
print(len(x_test), 'test sequences')

print('Pad sequences (samples x time)')
x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)

print('Build model...')
model = Sequential()
model.add(Embedding(max_features, output_dim=128))
model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(1))
model.add(Activation('sigmoid'))  # binary labels + binary_crossentropy need a sigmoid on a single unit, not softmax

model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

print('Train...')
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=15,
          validation_data=(x_test, y_test))

score, acc = model.evaluate(x_test, y_test,
                            batch_size=batch_size)

print('Test score:', score)
print('Test accuracy:', acc)

# Save and persist the trained model
model_yaml = model.to_yaml()
with open("model_lstm.yaml", "w") as yaml_file:
    yaml_file.write(model_yaml)
# serialize weights to HDF5
model.save_weights("model_lstm.h5")
print("model persisted on disk")

from deepexplain.tensorflow import DeepExplain
with DeepExplain(session=K.get_session()) as de:
    # load YAML and create model
    yaml_file = open('model_lstm.yaml', 'r')
    loaded_model_yaml = yaml_file.read()
    yaml_file.close()
    loaded_model = model_from_yaml(loaded_model_yaml)
    # load weights into new model
    loaded_model.load_weights("model_lstm_sigmoid.h5")
    print("Loaded model from disk")
    uploaded_model = loaded_model
    input_tensor = uploaded_model.layers[0].input
    xs = np.array([x_test[1]])
    ys = np.array([y_test[1]])

    print('Predicted class : {}'.format(uploaded_model.predict(xs)))
    print('Ground Truth: {}'.format(ys))
    embedding_tensor = uploaded_model.layers[0].output
    embedding_out = de.session.run(embedding_tensor, {input_tensor: xs})
    print(embedding_out.shape)
    attributions = de.explain('elrp', uploaded_model.layers[-2].output * ys,
                              uploaded_model.layers[1].input, embedding_out)
marcoancona commented 6 years ago

Thanks for sharing the code. Unfortunately, this does not work correctly: upon calling de.explain(), the following warning is displayed:

UserWarning: DeepExplain detected you are trying to use an attribution method that requires gradient override but the original gradient was used instead. You might have forgot to (re)create your graph within the DeepExlain context. Results are not reliable!

This happens because you did not create (or recreate) the graph within the DeepExplain context. This will work for Gradient*Input, Integrated Gradients and Occlusion, but the results of LRP and DeepLIFT (which require gradient overriding) are just wrong. In any case, this version of LRP cannot be applied to LSTM units even with a correct implementation, so please consider using another method.
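For example, with the rest of your setup unchanged, Integrated Gradients should work as a drop-in replacement, since it does not require gradient overriding (a sketch, not tested):

# 'intgrad' does not rely on the gradient override, so it also gives
# meaningful results on a graph created outside the DeepExplain context
attributions = de.explain('intgrad', uploaded_model.layers[-2].output * ys,
                          uploaded_model.layers[1].input, embedding_out)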

pramitchoudhary commented 6 years ago

Oh yes, that's right. Thanks for catching that. Also, agreed on this version of LRP not being the right choice for LSTMs. I have updated the code. One can build a model and then persist it outside the DeepExplain context; the model can then be loaded within the context and the relevant algorithm (e.g., Integrated Gradients or Occlusion) applied, as in the sketch below.
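For completeness, here is a minimal sketch of that workflow (not tested; it reuses the file names and variables from the code above and picks 'intgrad', since LRP is not suitable for LSTMs):

# 1) Build, train and persist the model OUTSIDE the DeepExplain context,
#    exactly as above, producing model_lstm.yaml / model_lstm.h5

# 2) Reload it INSIDE the context, so the graph is (re)created with
#    DeepExplain's gradient overrides in place
with DeepExplain(session=K.get_session()) as de:
    with open('model_lstm.yaml') as f:
        loaded_model = model_from_yaml(f.read())
    loaded_model.load_weights('model_lstm.h5')

    embedding_tensor = loaded_model.layers[0].output
    input_tensor = loaded_model.layers[0].input
    embedding_out = de.session.run(embedding_tensor, {input_tensor: xs})

    attributions = de.explain('intgrad', loaded_model.layers[-2].output * ys,
                              loaded_model.layers[1].input, embedding_out)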