shap / shap

A game theoretic approach to explain the output of any machine learning model.
https://shap.readthedocs.io
MIT License

shap_values fails to run on keras model with BatchNormalization layer #1110

Open PPere5 opened 4 years ago

PPere5 commented 4 years ago

Hi,

After banging my head against this problem for the past few hours, I decided to ask for your assistance.

I am working with a keras model with the following layout:

____________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, 53)]              0
_________________________________________________________________
batch_normalization (BatchNo (None, 53)                212       
_________________________________________________________________
activation (Activation)      (None, 53)                0
_________________________________________________________________
h1 (Dense)                   (None, 64)                3456
_________________________________________________________________
batch_normalization_1 (Batch (None, 64)                256
_________________________________________________________________
activation_1 (Activation)    (None, 64)                0
_________________________________________________________________
h2 (Dense)                   (None, 64)                4160
_________________________________________________________________
batch_normalization_2 (Batch (None, 64)                256
_________________________________________________________________
activation_2 (Activation)    (None, 64)                0
_________________________________________________________________
h3 (Dense)                   (None, 64)                4160
_________________________________________________________________
batch_normalization_3 (Batch (None, 64)                256
_________________________________________________________________
activation_3 (Activation)    (None, 64)                0
_________________________________________________________________
h4 (Dense)                   (None, 32)                2080
_________________________________________________________________
batch_normalization_4 (Batch (None, 32)                128
_________________________________________________________________
activation_4 (Activation)    (None, 32)                0
_________________________________________________________________
h5 (Dense)                   (None, 16)                528
_________________________________________________________________
batch_normalization_5 (Batch (None, 16)                64
_________________________________________________________________
activation_5 (Activation)    (None, 16)                0
_________________________________________________________________
=================================================================
Total params: 15,573
Trainable params: 14,987
Non-trainable params: 586
_________________________________________________________________

After training the model, running:

explainer = shap.DeepExplainer(loaded_model, data)
shap_values = explainer.shap_values(data, check_additivity=False)

(sorry, I cannot share the whole code; confidentiality and such...)

will return an error:

    C:\Users\p\Virtual_Environments\env-ml\lib\site-packages\shap\explainers\deep\deep_tf.py:244 grad_graph  *
    C:\Users\p\Virtual_Environments\env-ml\lib\site-packages\tensorflow_core\python\eager\backprop.py:1029 gradient
        unconnected_gradients=unconnected_gradients)
    C:\Users\p\Virtual_Environments\env-ml\lib\site-packages\tensorflow_core\python\eager\imperative_grad.py:77 imperative_grad
        compat.as_str(unconnected_gradients.value))
    C:\Users\p\Virtual_Environments\env-ml\lib\site-packages\tensorflow_core\python\eager\backprop.py:137 _gradient_function
        grad_fn = ops._gradient_registry.lookup(op_name)  # pylint: disable=protected-access
    C:\Users\p\Virtual_Environments\env-ml\lib\site-packages\tensorflow_core\python\framework\registry.py:97 lookup
        "%s registry has no entry for: %s" % (self._name, name))

    LookupError: gradient registry has no entry for: shap_AddV2

The same code runs fine if I alter the model to remove the batchnorm layers. I am using TensorFlow 2.1. Could you please shed some light on this issue?
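
For what it's worth, here is a rough stand-in that mirrors the layout above (random data, generic relu activations, and an assumed single sigmoid output in place of the confidential details); running it should hit the same LookupError:

import numpy as np
import shap
from tensorflow import keras

# Stand-in for the confidential dataset: 500 random samples with 53 features
data = np.random.rand(500, 53).astype(np.float32)

# Same layout as the summary above: each Dense block followed by BatchNormalization + Activation
inputs = keras.Input(shape=(53,))
x = keras.layers.BatchNormalization()(inputs)
x = keras.layers.Activation("relu")(x)
for i, units in enumerate([64, 64, 64, 32, 16], start=1):
    x = keras.layers.Dense(units, name=f"h{i}")(x)
    x = keras.layers.BatchNormalization()(x)
    x = keras.layers.Activation("relu")(x)
outputs = keras.layers.Dense(1, activation="sigmoid")(x)
loaded_model = keras.Model(inputs, outputs)
loaded_model.compile(optimizer="adam", loss="binary_crossentropy")

explainer = shap.DeepExplainer(loaded_model, data)
shap_values = explainer.shap_values(data, check_additivity=False)  # raises LookupError: ... shap_AddV2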

Thank you!

PPere5 commented 4 years ago

Quick update: if I use a dropout layer instead of batchnorm, I get the following error:

Traceback (most recent call last):
  File "c:/Users/p/Models/IP_Models/Neural_Network/IP_Conversion_predict.py", line 61, in <module>
    shap_values = explainer.shap_values(data)
  File "C:\Users\p\Documents\Virtual_Environments\env-ml\lib\site-packages\shap\explainers\deep\__init__.py", line 119, in shap_values
    return self.explainer.shap_values(X, ranked_outputs, output_rank_order, check_additivity=check_additivity)
  File "C:\Users\p\Documents\Virtual_Environments\env-ml\lib\site-packages\shap\explainers\deep\deep_tf.py", line 334, in shap_values
    "as a github issue, with a reproducable example if possible so we can debug it." % np.abs(diffs).max()
AssertionError: The SHAP explanations do not sum up to the model's output! This is either because of a rounding error or because an operator in your computation graph was not fully supported. If the sum difference of 1.027870 is significant compared the scale of your model outputs please post as a github issue, with a reproducable example if possible so we can debug it.
towhid355 commented 4 years ago

@PPere5 what is the activation of the last output layer? sigmoid ?

PPere5 commented 4 years ago

@PPere5 what is the activation of the last output layer? sigmoid ?

Yes

metalwhale commented 4 years ago

Have you tried adding this line before using the DeepExplainer?

shap.explainers.deep.deep_tf.op_handlers["AddV2"] = shap.explainers.deep.deep_tf.passthrough
SaravananOffl commented 4 years ago

Hi, I'm facing the same original issue. I tried @metalwhale's suggestion, but then I got another lookup error. Any suggestions for resolving this?

Code Snippet:

background = X_train[np.random.choice(X_train.shape[0], 100, replace=False)]  # 100 random training examples as our background dataset to integrate over
shap.explainers.deep.deep_tf.op_handlers["AddV2"] = shap.explainers.deep.deep_tf.passthrough
explainer = shap.DeepExplainer(model,  background)

shap_values = explainer.shap_values(background[:10])
explainer.expected_value[0]

Error Log:

----> 9 shap_values = explainer.shap_values(background[:10])
     10 explainer.expected_value[0]
     11 # explainer.shap_values

13 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/func_graph.py in wrapper(*args, **kwargs)
    966           except Exception as e:  # pylint:disable=broad-except
    967             if hasattr(e, "ag_error_metadata"):
--> 968               raise e.ag_error_metadata.to_exception(e)
    969             else:
    970               raise

StagingError: in user code:

    /usr/local/lib/python3.6/dist-packages/shap/explainers/deep/deep_tf.py:244 grad_graph  *
        x_grad = tape.gradient(out, shap_rAnD)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/backprop.py:1048 gradient  **
        unconnected_gradients=unconnected_gradients)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/imperative_grad.py:77 imperative_grad
        compat.as_str(unconnected_gradients.value))
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/backprop.py:145 _gradient_function
        grad_fn = ops._gradient_registry.lookup(op_name)  # pylint: disable=protected-access
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/registry.py:97 lookup
        "%s registry has no entry for: %s" % (self._name, name))

    LookupError: gradient registry has no entry for: shap_TensorListStack
pierresphabmixay commented 4 years ago

When running explainer.expected_value[0] I'm also getting the error

LookupError: gradient registry has no entry for: shap_TensorListStack

juliotorrest commented 4 years ago

I was having the same issue. Using GradientExplainer instead of DeepExplainer is a temporary solution. Please see: https://github.com/slundberg/shap/issues/885#issuecomment-564778328
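
For anyone who wants the concrete swap, here is a minimal sketch of the GradientExplainer variant (same loaded_model / data names as in the original post; GradientExplainer relies on TensorFlow's own gradients rather than shap's per-op handlers, so it should not hit the shap_AddV2 lookup):

import shap

# GradientExplainer (expected gradients) as a drop-in alternative to DeepExplainer
explainer = shap.GradientExplainer(loaded_model, data)
# explaining a small slice keeps the runtime reasonable
shap_values = explainer.shap_values(data[:100])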

vedal commented 4 years ago

I experienced a similar error with another layer and GradientExplainer. Updating tensorflow solved the issue for me.

dupsys commented 4 years ago

I am experiencing the same error:

explainer = shap.DeepExplainer(model, x_train[:100])
shap_values = explainer.shap_values(x_test[:10])

LookupError: gradient registry has no entry for: shap_TensorListStack

juliotorrest commented 4 years ago

Yeah, I think the best option is to update TensorFlow and SHAP. I am running tf v2.1 and shap v0.36, and I still sometimes have issues with DeepExplainer. However, GradientExplainer seems to work fine, although it does not offer the same functionality.

dupsys commented 4 years ago

@juliotorrest, thanks for your response. I am using current versions of TensorFlow and SHAP:

TensorFlow version: 2.2.1, SHAP version: 0.36.0

but I still get the same problem.

juliotorrest commented 4 years ago

Even with GradientExplainer?

Also, in this thread there were some mentions of how to initialize the gradient. Maybe that works.

ghost commented 4 years ago

Does anyone know whether this issue can be solved by downgrading or updating TF/Keras or SHAP? I am not sure which versions shap is targeting with the 0.36 release.

carloszanella commented 4 years ago

I'm having the same issue using a DeepExplainer on TF 2.3.1 and SHAP 0.36.0. Although it doesn't look nice, adding this line (slightly changed from the snippet in an earlier comment) apparently solved my problem, though I have no idea whether it affects the results of the analysis:

shap.explainers._deep.deep_tf.op_handlers["AddV2"] = shap.explainers._deep.deep_tf.passthrough
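
As far as I can tell, passthrough just falls back to the op's standard TensorFlow gradient, and since AddV2 is a plain element-wise add (linear), that should be harmless; it still seems wise to keep the additivity check on as a sanity check. A sketch of the full ordering (model and background are placeholder names; the handler has to be registered before the explainer is built):

import shap

# shap >= 0.36 appears to have moved the module from shap.explainers.deep to shap.explainers._deep
shap.explainers._deep.deep_tf.op_handlers["AddV2"] = shap.explainers._deep.deep_tf.passthrough

explainer = shap.DeepExplainer(model, background)
shap_values = explainer.shap_values(background[:10])  # check_additivity defaults to True
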
Mythreyi-V commented 4 years ago

Using TF 2.3.0 and SHAP 0.35.0. This seems to have solved the problem for me:

import tensorflow as tf
from tensorflow.compat.v1.keras.backend import get_session
tf.compat.v1.disable_v2_behavior()
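
Note that disable_v2_behavior() generally has to run before any other TensorFlow work, i.e. before the model graph is built or loaded. A sketch of the ordering, reusing the loaded_model / data names from the top of the thread:

import tensorflow as tf
import shap

tf.compat.v1.disable_v2_behavior()  # call this first, before the Keras model is built or loaded

# ... build or load `loaded_model` and `data` exactly as before, only *after* the call above ...

explainer = shap.DeepExplainer(loaded_model, data)
shap_values = explainer.shap_values(data)
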
niloofardadras commented 3 years ago

Using TF 2.3.0 and SHAP 0.35.0. This seems to have solved the problem for me:


import tensorflow as tf
from tensorflow.compat.v1.keras.backend import get_session
tf.compat.v1.disable_v2_behavior()

Saved my life
ub-sherkhane commented 3 years ago

If you are running in a Jupyter notebook, just restart the kernel and run again.

SudilHasitha commented 3 years ago
shap.explainers.deep.deep_tf.op_handlers["AddV2"] = shap.explainers.deep.deep_tf.passthrough

emarkou commented 3 years ago

@SudilHasitha try shap.explainers._deep.deep_tf.op_handlers["AddV2"] = shap.explainers._deep.deep_tf.passthrough as suggested by @carloszanella

gianmarco-terrones commented 2 years ago

I built an ANN with TensorFlow 2.3.0 and Keras 2.4.3, and the combination of tf.compat.v1.disable_v2_behavior() and e = shap.DeepExplainer(model, X_train) and shap_values = e.shap_values(X_test, check_additivity=False) makes things run. However, the resulting SHAP values I get, in addition to not satisfying the additivity check (the mean model prediction plus the sum of the SHAP values != model prediction), also do not make sense in the context of model predictions. For example, a test instance that the ANN predicts to have a target property value of 10 will have a lower sum of SHAP values than a test instance that the ANN predicts to have a target property value of 0. So it seems something goes very wrong for SHAP values if you force the SHAP analysis to run in this way. I've been told that this additivity issue does not come up when using TensorFlow 1.14, however.

tl;dr: Forcing the SHAP analysis to run may lead to invalid SHAP values.

Issue #1238 seems related to this problem

gianmarco-terrones commented 2 years ago

Update: using KernelExplainer instead of DeepExplainer seems to fix things! See #1199
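
For reference, a rough sketch of that route (same model / X_train / X_test names as above; the shap.kmeans summarization of the background is optional but keeps KernelExplainer's runtime manageable):

import shap

# KernelExplainer treats the network as a black box via model.predict,
# so no TensorFlow gradient handlers (shap_AddV2 etc.) are involved.
background_summary = shap.kmeans(X_train, 50)  # summarize the background to 50 centroids
explainer = shap.KernelExplainer(model.predict, background_summary)
shap_values = explainer.shap_values(X_test[:10])  # KernelExplainer is slow, so explain a small slice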

SudilHasitha commented 2 years ago

Problem solved, thank you @emarkou @gianmarco-terrones.

OphirLiu commented 1 year ago
shap.explainers.deep.deep_tf.op_handlers["AddV2"] = shap.explainers.deep.deep_tf.passthrough

shap.explainers._deep.deep_tf.op_handlers["AddV2"] = shap.explainers._deep.deep_tf.passthrough

@metalwhale's answer is missing a '_' before deep.