This version of DeepLIFT has been tested with Keras 2.2.4 & tensorflow 1.14.0. See this FAQ question for information on other implementations of DeepLIFT that may work with different versions of tensorflow/pytorch, as well as a wider range of architectures. See the tags for older versions.
This repository implements the methods in "Learning Important Features Through Propagating Activation Differences" by Shrikumar, Greenside & Kundaje, as well as other commonly-used methods such as gradients, gradient-times-input (equivalent to a version of Layerwise Relevance Propagation for ReLU networks), guided backprop and integrated gradients.
Here is a link to the slides and the video of the 15-minute talk given at ICML. Here is a link to a longer series of video tutorials. Please see the FAQ and file a github issue if you have questions.
Note: when running DeepLIFT for certain computer vision tasks you may get better results if you compute contribution scores of some higher convolutional layer rather than the input pixels. Use the argument find_scores_layer_idx
to specify which layer to compute the scores for.
Please be aware that figuring out optimal references is still an open problem. Suggestions on good heuristics for different applications are welcome. In the meantime, feel free to look at this github issue for general ideas: https://github.com/kundajelab/deeplift/issues/104
Please feel free to follow this repository to stay abreast of updates.
DeepLIFT is on pypi, so it can be installed using pip:
pip install deeplift
If you want to be able to make edits to the code, it is recommended that you clone the repository and install using the --editable
flag.
git clone https://github.com/kundajelab/deeplift.git #will clone the deeplift repository
pip install --editable deeplift/ #install deeplift from the cloned repository. The "editable" flag means changes to the code will be picked up automatically.
While DeepLIFT does not require your models to be trained with any particular library, we have provided autoconversion functions to convert models trained using Keras into the DeepLIFT format. If you used a different library to train your models, you can still use DeepLIFT if you recreate the model using DeepLIFT layers.
This implementation of DeepLIFT was tested with tensorflow 1.7, and autoconversion was tested using keras 2.0.
These examples show how to autoconvert a keras model and obtain importance scores. Non-keras models can be converted to DeepLIFT if they are saved in the keras 2.0 format
#Convert a keras sequential model
import deeplift
from deeplift.conversion import kerasapi_conversion as kc
#NonlinearMxtsMode defines the method for computing importance scores.
#NonlinearMxtsMode.DeepLIFT_GenomicsDefault uses the RevealCancel rule on Dense layers
#and the Rescale rule on conv layers (see paper for rationale)
#Other supported values are:
#NonlinearMxtsMode.RevealCancel - DeepLIFT-RevealCancel at all layers (used for the MNIST example)
#NonlinearMxtsMode.Rescale - DeepLIFT-rescale at all layers
#NonlinearMxtsMode.Gradient - the 'multipliers' will be the same as the gradients
#NonlinearMxtsMode.GuidedBackprop - the 'multipliers' will be what you get from guided backprop
#Use deeplift.util.get_integrated_gradients_function to compute integrated gradients
#Feel free to email avanti [dot] shrikumar@gmail.com if anything is unclear
deeplift_model =\
kc.convert_model_from_saved_files(
saved_hdf5_file_path,
nonlinear_mxts_mode=deeplift.layers.NonlinearMxtsMode.DeepLIFT_GenomicsDefault)
#Specify the index of the layer to compute the importance scores of.
#In the example below, we find scores for the input layer, which is idx 0 in deeplift_model.get_layers()
find_scores_layer_idx = 0
#Compile the function that computes the contribution scores
#For sigmoid or softmax outputs, target_layer_idx should be -2 (the default). This computes explanations
# w.r.t. the logits. (See "3.6 Choice of target layer" in https://arxiv.org/abs/1704.02685 for justification)
#For regression tasks with a linear output, target_layer_idx should be -1 (which simply refers to the last layer)
#Note that in the case of softmax outputs, it may be a good idea to normalize the softmax logits so
# that they sum to zero across all tasks. This ensures that if a feature is contributing equally to
# to all the softmax logits, it will effectly be seen as contributing to none of the tasks (adding
# a constant to all logits of a softmax does not change the output). This is discussed in
# https://github.com/kundajelab/deeplift/issues/116. One way to efficiently acheive this
# normalization is to mean-normalize the weights going into the Softmax layer as
# discussed in eqn. 21 in Section 2.5 of https://arxiv.org/pdf/1605.01713.pdf ("A note on Softmax Activation")
#If you want the DeepLIFT multipliers instead of the contribution scores, you can use get_target_multipliers_func
deeplift_contribs_func = deeplift_model.get_target_contribs_func(
find_scores_layer_idx=find_scores_layer_idx,
target_layer_idx=-1)
#You can also provide an array of indices to find_scores_layer_idx to get scores for multiple layers at once
#compute scores on inputs
#input_data_list is a list containing the data for different input layers
#eg: for MNIST, there is one input layer with with dimensions 1 x 28 x 28
#In the example below, let X be an array with dimension n x 1 x 28 x 28 where n is the number of examples
#task_idx represents the index of the node in the output layer that we wish to compute scores.
#Eg: if the output is a 10-way softmax, and task_idx is 0, we will compute scores for the first softmax class
scores = np.array(deeplift_contribs_func(task_idx=0,
input_data_list=[X],
batch_size=10,
progress_update=1000))
This will work for sequential models involving dense and/or conv1d/conv2d layers and linear/relu/sigmoid/softmax or prelu activations. Please create a github issue or email avanti [dot] shrikumar@gmail.com readme if you are interested in support for other layer types.
The syntax for using functional models is similar; you can use deeplift_model.get_name_to_layer().keys()
to get a list of layer names when figuring out how to specify find_scores_layer_name
and pre_activation_target_layer_name
:
deeplift_model =\
kc.convert_model_from_saved_files(
saved_hdf5_file_path,
nonlinear_mxts_mode=deeplift.layers.NonlinearMxtsMode.DeepLIFT_GenomicsDefault)
#The syntax below for obtaining scores is similar to that of a converted graph model
#See deeplift_model.get_name_to_layer().keys() to see all the layer names
#As before, you can provide an array of names to find_scores_layer_name
#to get the scores for multiple layers at once
deeplift_contribs_func = deeplift_model.get_target_contribs_func(
find_scores_layer_name="name_of_input_layer",
pre_activation_target_layer_name="name_goes_here")
A notebook replicating the results in the paper on MNIST is at examples/mnist/MNIST_replicate_figures.ipynb
, and a notebook demonstrating use on a genomics model with 1d convolutions is at examples/genomics/genomics_simulation.ipynb
.
The 15-minute talk from ICML gives an intuition for the method. Here are links to the slides and the video (the video truncates the slides, which is why the slides are linked separately). Please file a github issue if you have questions.
My first suggestion would be to look at DeepSHAP/DeepExplainer (Lundberg & Lee), DeepExplain (Ancona et al.) or Captum (if you are using pytorch) to see if any of them satisfy your needs. They are implemented by overriding gradient operators and thus support a wider variety of architectures. However, none of these implementations support the RevealCancel rule (which deals with failure modes such as the min function). The pros and cons of DeepSHAP vs DeepExplain are discussed in more detail below. If you would really like to have the RevealCancel rule, go ahead and post a github issue, although my energies are currently focused on other projects and I may not be able to get to it for some time.
Note for people in genomics planning to use TF-MoDISco: for DeepSHAP, I have a custom branch of the DeepSHAP repository that has functionality for computing hypothetical importance scores. A colab notebook demonstrating the use of that repository is here, and a tutorial I made on DeepSHAP for genomics is here.
Both DeepExplain (Ancona et al.) and DeepSHAP/DeepExplainer work by overriding gradient operators, and can thus support a wider variety of architectures than those that are covered in the DeepLIFT repo (in fact, the DeepSHAP/DeepExplainer implementation was inspired by Ancona et al.'s work and builds on a connection between DeepLIFT and SHAP, described in the SHAP paper). For the set of architectures described in the DeepLIFT paper, i.e. linear matrix multiplications, convolutions, and single-input nonlinearities (like ReLUs), both these implementations are identical to DeepLIFT with the Rescale rule. However, neither implementation supports DeepLIFT with the RevealCancel rule (a rule that was developed to deal with failure cases such as the min function, and which unfortunately is not easily implemented by overriding gradient operators). The key differences are as follows:
(1) DeepExplain uses standard gradient backpropagation for elementwise operations (such as those present in LSTMs/GRUs/Attention). This will likely violate the summation-to-delta property (i.e. the property that the sum of the attributions over the input is equal to the difference-from-reference of the output). If you have elementwise operations, I recommend you use DeepSHAP/DeepExplainer, which employs a summation-to-delta-preserving backprop rule. The same is technically true for Maxpooling operations when a non-uniform reference is used (though this has not been a salient problem for us in practice); the DeepSHAP/DeepExplainer implementation guarantees summation-to-delta is satisfied for Maxpooling by assigning credit/blame to either the neuron that is the max in the actual input or the neuron that was the max in the reference (this is different from the 'Max' attribution rule proposed in the SHAP paper; that attribution rule does not scale well).
(2) DeepExplain (by Ancona et al.) does not support the dynamic reference that is demonstrated in the DeepLIFT repo (i.e. the case where a different reference is generated according to the properties of the input example, such as the 'dinucleotide shuffled' references used in genomics). I've implemented the dynamic reference feature for DeepSHAP/DeepExplainer (click for a link to the PR). Also, if you are planning to use DeepSHAP for genomics with TF-MoDISco, please see the note above on my custom implementation of DeepSHAP for computing hypothetical importance scores + a link to the slides for a tutorial.
(3) DeepSHAP/DeepExplainer is implemented such that multiple references can be used for a single example, and the final attributions are averaged over each reference. However, the way this is implemented, each GPU batch calculates attributions for a single example, for all references. This means that the DeepSHAP/DeepExplainer implementation might be slow in cases where you have a large number of samples and only one reference. By contrast, DeepExplain (Ancona et al.) is structured such that the user provides a single reference, and this reference is used for all the examples. Thus, DeepExplain (Ancona et al.) allows GPU batching across examples, but does not allow for GPU batching across different references.
In summary, my recommendations are: use DeepSHAP if you have elementwise operations (e.g. GRUs/LSTMs/Attention), a need for dynamic references, or a large number of references compared to samples. Use DeepExplain when you have a large number of samples compared to references.
Poerner et al. conducted a series of benchmarks comparing DeepLIFT to other explanation methods on NLP tasks. Their implementation differs from the canonical DeepLIFT implementation in two main ways. First, they considered only the Rescale rule of DeepLIFT (according to the implementation here). Second, to handle operations that involve multiplications with gating units (which DeepLIFT was not designed for), they treat the gating neuron as a weight (similar to the approach in Arras et al.) and assign all importance to the non-gating neuron. Note that this differs from the implementation in DeepSHAP/DeepExplainer, which handles elementwise multiplications using a backprop rule base on SHAP and would assign importance to the gating neuron. We have not studied the appropriateness of Arras et al.'s approach, but the authors did find that "LIMSSE, LRP (Bach et al., 2015) and DeepLIFT (Shrikumar et al., 2017) are the most effective explanation methods (§4): LRP and DeepLIFT are the most consistent methods, while LIMSSE wins the hybrid document experiment." (They did not compare with the DeepSHAP/DeepExplainer implementation)
As illustrated in the DeepLIFT paper, the RevealCancel rule of DeepLIFT can allow DeepLIFT to properly handle cases where integrated gradients may give misleading results. Independent researchers have found that DeepLIFT with just the Rescale rule performs comparably to Integrated Gradients (they write: “Integrated Gradients and DeepLIFT have very high correlation, suggesting that the latter is a good (and faster) approximation of the former in practice”). Their finding was consistent with our own personal experience. The speed improvement of DeepLIFT relative to Integrated Gradients becomes particularly useful when using a collection of references (since having a collection of references per example increases runtime).
At the moment, we do not. However, if you are able to convert your model into the saved file format used by the Keras 2 API, then you can use this branch to load it into the DeepLIFT format. For inspiration on how to achieve this, you can look at examples/convert_models/keras1.2to2
for a notebook demonstrating how to convert models saved in the keras1.2 format to keras 2. DeepLIFT conversion works directly from keras saved files without ever actually loading the models into keras. If you have a pytorch model, you may also be interested in the Captum implementation.
A negative contribution score on an input means that the input contributed to moving the output below its reference value, where the reference value of the output is the value that it has when provided the reference input. A negative contribution does not mean that the input is "unimportant". If you want to find inputs that DeepLIFT considers "unimportant" (i.e. DeepLIFT thinks they don't influence the output of the model much), these would be the inputs that have contribution scores near 0.
Just as you supply input_data_list
as an argument to the scoring function, you can also supply input_references_list
. It would have the same dimensions as input_data_list
, but would contain reference images for each input.
The choice of reference depends on the question you wish to ask of the data. Generally speaking, the reference should retain the properties you don't care about and scramble the properties you do care about. In the supplement of the DeepLIFT paper, Appendix L looks at the results on a CIFAR10 model with two different choices of the reference. You'll notice that when a blurred version of the input is used as a reference, the outlines of objects stand out. When a black reference is used, the results are more confusing, possibly because the net is also highlighting color. If you have a particular reference in mind, it is a good idea to check that the output of the model on that reference is consistent with what you expect. Another idea to consider is using multiple different references to interpret a single image and averaging the results over all the different references. We use this approach in genomics; we generate a collection of references per input sequence by shuffling the sequence (this is demonstrated in the genomics example notebook).
It is fine to average the DeepLIFT contribution scores across examples. Be aware that there might be considerable heterogeneity in your data (i.e. some inputs may be very important for some subset of examples but not others, some inputs may contribute positively on some examples and negatively on others) so clustering may prove more insightful than averaging. For the purpose of feature selection, a reasonable heuristic would be to rank inputs in descending order of the average magnitude of the DeepLIFT contribution scores.
Yes. Rather than providing a single numpy array to input_data_list, provide a list of numpy arrays containing the input to each mode. You can also provide a dictionary to input_data_list where the key is the mode name and the value is the numpy array. Each numpy array should have the first axis be the sample axis.
Also yes. Just provide a list to find_scores_layer_name
rather than a single argument.
MIT License. While we had originally filed a patent on some of our interpretability work, we have since disbanded the patent as it appears this project has enough interest from the community to be best distributed in open-source format.
You are likely thinking of TF-MoDISco. Here is a link to that code.
Please email avanti [dot] shrikumar [at] gmail.com with questions, ideas, feature requests, etc. If I don't respond, keep emailing me until I feel guilty and respond. Also feel free to email my adviser (anshul [at] kundaje [dot] net), who can further guilt me into responding. I promise I do actually want to respond; I'm just busy with other things because the incentive structure of academia doesn't reward maintenance of projects.
This section explains finer aspects of the deeplift implementation
The layer (deeplift.layers.core.Layer
) is the basic unit. deeplift.layers.core.Dense
and deeplift.layers.convolution.Conv2D
are both examples of layers.
Layers implement the following key methods:
Returns symbolic variables representing the activations of the layer. For an understanding of symbolic variables, refer to the documentation of symbolic computation packages like theano or tensorflow.
Returns symbolic variables representing the positive/negative multipliers on this layer (for the selected output). See paper for details.
Returns symbolic variables representing the importance scores. This is a convenience function that returns self.get_pos_mxts()*self._pos_contribs() + self.get_neg_mxts()*self._neg_contribs()
. See paper for details.
Here are the steps necessary to implement a forward pass. If executed correctly, the results should be identical (within numerical precision) to a forward pass of your original model, so this is definitely worth doing as a sanity check. Note that if autoconversion (as described in the quickstart) is an option, you can skip steps (1) and (2).
set_inputs
function. The argument to set_inputs
depends on what the layer expects
deeplift.backend.function([input_layer.get_activation_vars()...], output_layer.get_activation_vars())
model.get_layers()
for sequential models (where this function would return a list of layers) or model.get_name_to_layer()
for Graph models (where this function would return a dictionary mapping layer names to layers) deeplift.util.run_function_in_batches(func, input_data_list)
to run the function in batches (which would be advisable if you want to call the function on a large number of inputs that wont fit in memory)
func
is simply the compiled function returned by deeplift.backend.function
input_data_list
is a list of numpy arrays containing data for the different input layers of the network. In the case of a network with one input, this will be a list containing one numpy arrayrun_function_in_batches
are batch_size
and progress_update
Here are the steps necessary to implement the backward pass, which is where the importance scores are calculated. Ideally, you should create a model through autoconversion (described in the quickstart) and then use model.get_target_contribs_func
or model.get_target_multipliers_func
. Howver, if that is not an option, read on (please also consider sending us a message to let us know, as if there is enough demand for a feature we will consider adding it). Note the instructions below assume you have done steps (1) and (2) under the forward pass section.
reset_mxts_updated()
. This resets the symbolic variables for computing the multipliers. If this is the first time you are compiling the backward pass, this step is not strictly necessary.set_scoring_mode(deeplift.layers.ScoringMode.OneAndZeros)
.
set_scoring_mode
on all the target layers that you might conceivably want to find the scores with respect to. This will save you from having to recompile the function to allow a different target layer later.update_mxts()
. This will create the symbolic variables that compute the multipliers with respect to the layer specified in step 2.Compile the importance score computation function with
deeplift.backend.function([input_layer.get_activation_vars()...,
input_layer.get_reference_vars()...],
layer_to_find_scores_for.get_target_contrib_vars())
get_target_contrib_vars()
which returns the importance scores (in the case of NonlinearMxtsMode.DeepLIFT
, these are called "contribution scores"), you can use get_pos_mxts()
or get_neg_mxts()
to get the multipliers.set_active()
on the layer.update_task_index(task_idx)
on the layer. Here task_idx
is the index of a neuron within the layer.deeplift.util.run_function_in_batches
to do this.set_inactive()
on the layer. Don't forget this!