marcoancona / DeepExplain

A unified framework of perturbation and gradient-based attribution methods for Deep Neural Networks interpretability. DeepExplain also includes support for Shapley Values sampling. (ICLR 2018)
https://arxiv.org/abs/1711.06104
MIT License

Support for multiple inputs #5

Closed vagarwal87 closed 6 years ago

vagarwal87 commented 6 years ago

Hi, I'm having a little difficulty with a regression example and wondered if you could help me a bit. The network has two inputs, only one of which is fed through convolutional layers before the two branches are joined at dense layers, ultimately outputting to a single neuron trained to minimize mean squared error loss. I'm trying to load a pre-trained Keras model, and my code looks like this:

import os, h5py
from keras import backend as K
from keras.models import load_model, Model
from deepexplain.tensorflow import DeepExplain

model = load_model(data_file)  # load h5 file w/ graph and pre-trained weights
with DeepExplain(session=K.get_session()) as de:
    input_tensor = [model.layers[0].input, model.layers[6].input]  # index 6 is where the 2nd input is
    fModel = Model(inputs=input_tensor, outputs=model.layers[-1].output)  # assuming -1 as this is a regression problem

    xs = data[0:1, :, :]   # grab first test example
    xs2 = data2[0:1, :]    # grab first test example of input type 2
    gradinput = de.explain('grad*input', fModel(input_tensor), input_tensor, [xs, xs2])

This leads to the following output/error:

DeepExplain: running "grad*input" explanation method (2)
Traceback (most recent call last):
  File "deep_explain.py", line 51, in <module>
    main()
  File "deep_explain.py", line 39, in main
    gradinput = de.explain('grad*input', fModel(input_tensor), input_tensor, [xs, xs2])
  File "/mnt/c/Users/vagar/Downloads/keras3/src/deepexplain/deepexplain/tensorflow/methods.py", line 414, in explain
    result = method.run()
  File "/mnt/c/Users/vagar/Downloads/keras3/src/deepexplain/deepexplain/tensorflow/methods.py", line 91, in run
    attributions = self.get_symbolic_attribution()
  File "/mnt/c/Users/vagar/Downloads/keras3/src/deepexplain/deepexplain/tensorflow/methods.py", line 156, in get_symbolic_attribution
    return tf.gradients(self.T, self.X)[0] * self.X
  File "/home/vagarwal/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/math_ops.py", line 793, in binary_op_wrapper
    y = ops.convert_to_tensor(y, dtype=x.dtype.base_dtype, name="y")
  File "/home/vagarwal/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 637, in convert_to_tensor
    as_ref=False)
  File "/home/vagarwal/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 702, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/home/vagarwal/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 905, in _autopacking_conversion_function
    return _autopacking_helper(v, inferred_dtype, name or "packed")
  File "/home/vagarwal/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 868, in _autopacking_helper
    return gen_array_ops._pack(elems_as_tensors, name=scope)
  File "/home/vagarwal/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 2041, in _pack
    result = _op_def_lib.apply_op("Pack", values=values, axis=axis, name=name)
  File "/home/vagarwal/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "/home/vagarwal/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2329, in create_op
    set_shapes_for_outputs(ret)
  File "/home/vagarwal/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1717, in set_shapes_for_outputs
    shapes = shape_func(op)
  File "/home/vagarwal/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1667, in call_with_requiring
    return call_cpp_shape_fn(op, require_shape_fn=True)
  File "/home/vagarwal/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 610, in call_cpp_shape_fn
    debug_python_shape_fn, require_shape_fn)
  File "/home/vagarwal/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 676, in _call_cpp_shape_fn_impl
    raise ValueError(err.message)
ValueError: Shapes must be equal rank, but are 3 and 2
From merging shape 0 with other shapes. for 'mul_53/y' (op: 'Pack') with input shapes: [?,10500,4], [?,6].

Is there something simple wrong with my syntax or method? I'm using Keras 2.0.2 with a TensorFlow 1.0.1 backend. Thank you for your help and for developing this package!

marcoancona commented 6 years ago

Right, DeepExplain did not support multiple inputs, but I just pushed a fix to add it. See this minimal example. Let me know if it works for you now.
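
For reference, a multi-input call looks roughly like this (a minimal sketch, not the exact linked example; model stands for your loaded two-input Keras model, and xs1/xs2 for the two input arrays, each with a leading batch dimension):

import numpy as np
from keras import backend as K
from keras.models import Model
from deepexplain.tensorflow import DeepExplain

with DeepExplain(session=K.get_session()) as de:
    input_tensors = model.inputs  # list containing both input placeholders
    fModel = Model(inputs=input_tensors, outputs=model.outputs)
    # pass a list of input tensors and a matching list of numpy arrays
    attributions = de.explain('grad*input', fModel(input_tensors), input_tensors, [xs1, xs2])
    # the result is a list with one attribution array per input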

vagarwal87 commented 6 years ago

Glad the error was fixed, and I appreciate the code update -- looks like everything is working smoothly on my end too. Closing the issue.

vagarwal87 commented 6 years ago

Sorry to reopen the issue, but I just wanted to let you know about a few more errors I encountered.

When I run the sample code you provided on my data, everything works fine for the grad*input, elrp, and deeplift methods.

However, for saliency maps I get back a matrix of zeros (with no error reported), and for integrated gradients I get an error and the computation does not finish. Would it be difficult to make these methods support multiple inputs as well, or is there a plan to fix them in the future?

Thanks in advance!

marcoancona commented 6 years ago

Hi, these methods should already be supported. I will have to look into it to see where the problem is. It would be very useful if you could provide a minimal example where these methods fail.

For integrated gradients, you can try using a batch size of 1 and passing the parameter steps=10 to the explain method, to check whether it works with fewer steps. Also, can you post the exact error/stacktrace you get?
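
For example (a sketch reusing fModel, input_tensor, and the data arrays from your script above):

xs_single = data[0:1, :, :]    # a batch containing a single sample for the first input
xs2_single = data2[0:1, :]     # a batch containing a single sample for the second input
# fewer interpolation steps for integrated gradients
intgrad = de.explain('intgrad', fModel(input_tensor), input_tensor, [xs_single, xs2_single], steps=10)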

vagarwal87 commented 6 years ago

OK, please find attached a minimal example of 10 samples (running with batch size = 1) and the relevant code to reproduce the error: dummydata.zip

I think I fixed the saliency problem... it was my fault. For the intgrad problem, batch size = 1 and steps=10 did not fix it. I run:

python deep_explain2.py model.h5 testdummy.h5

The output of this is:

DeepExplain: running "grad*input" explanation method (2)
DeepExplain: running "grad*input" explanation method (2)
DeepExplain: running "grad*input" explanation method (2)
DeepExplain: running "grad*input" explanation method (2)
DeepExplain: running "grad*input" explanation method (2)
DeepExplain: running "grad*input" explanation method (2)
DeepExplain: running "grad*input" explanation method (2)
DeepExplain: running "grad*input" explanation method (2)
DeepExplain: running "grad*input" explanation method (2)
DeepExplain: running "grad*input" explanation method (2)
DeepExplain: running "saliency" explanation method (1)
DeepExplain: running "saliency" explanation method (1)
DeepExplain: running "saliency" explanation method (1)
DeepExplain: running "saliency" explanation method (1)
DeepExplain: running "saliency" explanation method (1)
DeepExplain: running "saliency" explanation method (1)
DeepExplain: running "saliency" explanation method (1)
DeepExplain: running "saliency" explanation method (1)
DeepExplain: running "saliency" explanation method (1)
DeepExplain: running "saliency" explanation method (1)
DeepExplain: running "intgrad" explanation method (3)
Traceback (most recent call last):
  File "deep_explain2.py", line 40, in <module>
    main()
  File "deep_explain2.py", line 35, in main
    map = de.explain(method, fModel(input_tensor), input_tensor, [xs, xs2])
  File "/mnt/c/Users/vagar/Downloads/keras3/src/deepexplain/deepexplain/tensorflow/methods.py", line 451, in explain
    result = method.run()
  File "/mnt/c/Users/vagar/Downloads/keras3/src/deepexplain/deepexplain/tensorflow/methods.py", line 214, in run
    xs_mod = (np.array(self.xs) * alpha).tolist()
ValueError: could not broadcast input array from shape (10500,4) into shape (1)

I'm not 100% sure how to fix the issue, but is it because you assume the input is numerical? In my case, the input is boolean.

Along these lines, another error I get is when I pass in a baseline such as this:

baseline = [np.repeat(np.array([[0.2,0.2,0.3,0.3]]), 10500, axis=0), np.zeros(6)]
map = de.explain(method, fModel(input_tensor), input_tensor, [xs, xs2], baseline = baseline)

You can see that the dimensions of the baseline are (10500, 4) and (6,). However, the script fails with:

Traceback (most recent call last):
  File "deep_explain.py", line 52, in <module>
    main()
  File "deep_explain.py", line 45, in main
    map = de.explain(method, fModel(input_tensor), input_tensor, [xs, xs2], baseline = [xs, xs2])
  File "/mnt/c/Users/vagar/Downloads/keras3/src/deepexplain/deepexplain/tensorflow/methods.py", line 451, in explain
    result = method.run()
  File "/mnt/c/Users/vagar/Downloads/keras3/src/deepexplain/deepexplain/tensorflow/methods.py", line 284, in run
    self._set_check_baseline()
  File "/mnt/c/Users/vagar/Downloads/keras3/src/deepexplain/deepexplain/tensorflow/methods.py", line 106, in _set_check_baseline
    % (self.baseline[i].shape, self.xs[i].shape[1:]))
RuntimeError: Baseline shape (1, 10500, 4) does not match expected shape (10500, 4)

I think somewhere in your function an extra dimension is being added before the dimensionality check, because if I pass this same set of matrices directly to the _set_check_baseline function, it passes the 'if' statement and executes successfully.

Best wishes

marcoancona commented 6 years ago

About the baseline issue. Are you passing a baseline with batch dimension?

When provided, baseline must be a numpy array with the size of the input, without the batch dimension (default: zero).

The baseline has no batch dimension because the same baseline is used for every sample in the batch, with both Integrated Gradients and DeepLIFT. I will look into the issue.

marcoancona commented 6 years ago

The problem with integrated gradients was caused by the lack of support for multiple inputs of different shape. This is now fixed; thanks for the code example. It now works, except for the perturbation methods, which do not support multiple inputs.

If one of your inputs is binary (i.e. boolean), it might be a problem for Integrated Gradients, because the method gradually varies the input from the baseline value to the actual input value.
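
To illustrate the point (just a sketch of the idea, not DeepExplain internals): integrated gradients evaluates gradients at inputs interpolated between the baseline and the actual input, so a strictly 0/1 input is still probed at fractional values along that path.

import numpy as np

def interpolation_points(baseline, x, steps=10):
    # inputs at which the gradients are evaluated and then averaged
    alphas = np.linspace(0.0, 1.0, steps)
    return [baseline + a * (x - baseline) for a in alphas]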

vagarwal87 commented 6 years ago

"Are you passing a baseline with batch dimension?" From the example I've given, the input is [xs, xs2], where the dimension of xs is 1 x 10500 x 4, and dimension of xs2 is 1 x 6, and the 1 is the batch dimension. Therefore I was thinking that the correct dimension for the baseline (after having read your README) should be [bl1, bl2], where bl1 is 10500 x 4 and bl2 is 6 (which removes the batch dimension). Please let me know if this is not the case. Adding the line to my above script:

baseline = [np.repeat(np.array([[0.2,0.2,0.3,0.3]]), 10500, axis=0), np.zeros(6)]
map = de.explain(method, fModel(input_tensor), input_tensor, [xs, xs2], baseline = baseline)

should reproduce the error.

"The problem with integrated gradients was caused by lack of support for multiple inputs of different shape. Now this is fixed." Thanks, glad to hear! Regarding Boolean input, I'll switch this to floating point during training to see if it changes the outcome. It could be that Tensorflow automatically treats Boolean as a 0/1 floating point matrix.

marcoancona commented 6 years ago

Yes, your understanding of the baseline is correct. But I just realized from the stacktrace you posted above that you are using the following line:

map = de.explain(method, fModel(input_tensor), input_tensor, [xs, xs2], baseline = [xs, xs2])

Here the baseline is identical to the input (and still carries the batch dimension), which is what causes the problem. Your sample code works for me with the corrected baseline.
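
Concretely (a sketch using the variables from your own script), the difference is:

# problematic: the baseline equals the input and still carries the batch dimension
map = de.explain(method, fModel(input_tensor), input_tensor, [xs, xs2], baseline=[xs, xs2])

# intended: one baseline per input, without the batch dimension
baseline = [np.repeat(np.array([[0.2, 0.2, 0.3, 0.3]]), 10500, axis=0), np.zeros(6)]
map = de.explain(method, fModel(input_tensor), input_tensor, [xs, xs2], baseline=baseline)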

vagarwal87 commented 6 years ago

I accidentally posted the wrong stacktrace... but I figured out the source of my error; it was just strange Python behavior. When I run this:

import os, h5py
from optparse import OptionParser
from keras.models import load_model, Model
import numpy as np
from keras import backend as K
from deepexplain.tensorflow import DeepExplain

batchsize = 1
# note: baseline defined at module (global) scope
baseline2 = [np.repeat(np.array([[0.2, 0.2, 0.3, 0.3]]), 10500, axis=0), np.zeros(6)]

def main():
    usage = 'usage: %prog [options] <data_file> <testfile>'
    parser = OptionParser(usage)
    (options,args) = parser.parse_args()

    if len(args) != 2:
        print args
        parser.error('Must provide data file and output directory')
    else:
        data_file = args[0]
        testfile = args[1]

    testfile = h5py.File(testfile, 'r')
    Xin1, Xin2, y_test = testfile['data'], testfile['data2'], testfile['label']
    model = load_model(data_file)

    with DeepExplain(session=K.get_session()) as de:
        input_tensor = model.inputs
        fModel = Model(inputs = input_tensor, outputs = model.outputs)

        for method in ['deeplift', 'elrp', 'occlusion']: #'grad*input', 
            for i in range(0,10):
                xs = Xin2[i*batchsize:(i*batchsize+batchsize),3000:13500,:]
                xs2 = Xin1[i*batchsize:(i*batchsize+batchsize),:]
                ys = y_test[i*batchsize:(i*batchsize+batchsize)]
                map = de.explain(method, fModel(input_tensor), input_tensor, [xs, xs2], baseline = baseline2)
                X, X2 = map[0], map[1]
                np.savetxt(method.replace("*", "")+str(i)+'.txt', np.column_stack((ys, 10**4 * np.sum(X, 2), X2)), fmt='%.3f')

if __name__ == '__main__':
    main()

It results in the following stacktrace:

File "deep_explain2.py", line 41, in main() File "deep_explain2.py", line 36, in main map = de.explain(method, fModel(input_tensor), input_tensor, [xs, xs2], baseline = baseline2) File "/mnt/c/Users/vagar/Downloads/keras3/src/deepexplain/deepexplain/tensorflow/methods.py", line 452, in explain result = method.run() File "/mnt/c/Users/vagar/Downloads/keras3/src/deepexplain/deepexplain/tensorflow/methods.py", line 285, in run self._set_check_baseline() File "/mnt/c/Users/vagar/Downloads/keras3/src/deepexplain/deepexplain/tensorflow/methods.py", line 107, in _set_check_baseline % (self.baseline[i].shape, self.xs[i].shape[1:])) RuntimeError: Baseline shape (1, 10500, 4) does not match expected shape (10500, 4)

However, when I define baseline2 within the scope of the main function, everything mysteriously runs fine. I tend to use other languages more than Python, so this behavior around global vs. local scope is somewhat strange to me... in any case, thank you for your patience in helping me, and I'll close out the issue again.