SeldonIO / alibi

Algorithms for explaining machine learning models
https://docs.seldon.io/projects/alibi/en/stable/

How can Integrated Gradients be used for a BERT model with 3 inputs #495

Open nouna99 opened 3 years ago

nouna99 commented 3 years ago

When I tried to use IG for my 3-input model, I got this error:

```python
explanation = ig.explain(X_test_sample1['input_type_ids'].numpy().argmax(axis=1),
                         baselines=None,
                         target=None)
```

```
ValueError                                Traceback (most recent call last)
<ipython-input> in <module>()
      2 explanation = ig.explain(X_test_sample1['input_type_ids'].numpy().argmax(axis=1),
      3                          baselines=None,
----> 4                          target=None)

4 frames
/usr/local/lib/python3.7/dist-packages/keras/engine/input_spec.py in assert_input_compatibility(input_spec, inputs, layer_name)
    200             str(len(input_spec)) + ' input(s), '
    201             'but it received ' + str(len(inputs)) +
--> 202             ' input tensors. Inputs received: ' + str(inputs))
    203     for input_index, (x, spec) in enumerate(zip(inputs, input_spec)):
    204       if spec is None:

ValueError: Layer model expects 3 input(s), but it received 1 input tensors. Inputs received: []
```

The inputs of my model: {'input_word_ids': , 'input_mask': , 'input_type_ids': }
gipster commented 3 years ago

Hi @nouna99. It looks like you are trying to calculate the integrated gradients with respect to input_type_ids. I believe in BERT models this is an input mask similar to the attention mask. These masks must be passed to explain as forward_kwargs; it is not possible to calculate the integrated gradients with respect to them.

https://github.com/SeldonIO/alibi/blob/a009ef079f3b32328d9805e658d1ae5e00c6d7ff/alibi/explainers/integrated_gradients.py#L720

> forward_kwargs: Input keyword args. If it's not None, it must be a dict with numpy arrays as values. The first dimension of the arrays must correspond to the number of examples. It will be repeated for each of n_steps along the integrated path. The attributions are not computed with respect to these arguments.

In your example, you can calculate integrated gradients for 'input_word_ids', and you should pass 'input_mask' and 'input_type_ids' as forward_kwargs.
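For illustration, a minimal sketch of such a call (not taken from this thread; it assumes the model's call signature accepts input_mask and input_type_ids as keyword arguments, which may require wrapping the model so it can be called as model(input_word_ids, input_mask=..., input_type_ids=...)):

```python
from alibi.explainers import IntegratedGradients

# Hypothetical setup: `model` is the 3-input Keras BERT model and the encoded
# inputs in X_test_sample1 are numpy arrays of shape (n_samples, seq_len).
ig = IntegratedGradients(model, n_steps=50, method="gausslegendre")

X = X_test_sample1['input_word_ids']          # attributions are computed w.r.t. this input
forward_kwargs = {
    'input_mask': X_test_sample1['input_mask'],
    'input_type_ids': X_test_sample1['input_type_ids'],
}                                             # forwarded unchanged; no attributions for these

explanation = ig.explain(X,
                         forward_kwargs=forward_kwargs,
                         baselines=None,
                         target=None)         # for a classifier, target usually needs to be
                                              # the predicted class index per example
```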

jklaise commented 3 years ago

This example shows how it's done for BERT using forward_kwargs: https://docs.seldon.io/projects/alibi/en/latest/examples/integrated_gradients_transformers.html
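In outline, that notebook's approach looks roughly like the sketch below (paraphrased from memory, not copied verbatim; embedding_layer, input_ids, attention_mask and predicted_class are illustrative names). The key point is that attributions are computed with respect to an embedding layer via the layer argument, since raw integer token ids are not differentiable, while the attention mask is passed through forward_kwargs.

```python
from alibi.explainers import IntegratedGradients

# Sketch of the pattern used in the linked transformers example (assumed, not verbatim):
# `model` is a Keras model whose call accepts the mask as a keyword argument and
# `embedding_layer` refers to the sub-layer that maps token ids to embeddings.
ig = IntegratedGradients(model,
                         layer=embedding_layer,      # attribute to embeddings, not raw token ids
                         n_steps=50,
                         method="gausslegendre",
                         internal_batch_size=100)

explanation = ig.explain(input_ids,                  # np.ndarray of token ids, shape (n, seq_len)
                         forward_kwargs={'attention_mask': attention_mask},
                         baselines=None,
                         target=predicted_class)     # class index per example
attributions = explanation.attributions[0]           # attributions at the embedding layer
```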

nouna99 commented 3 years ago

Thank you for your help. I tried this:

```python
X1_test = bert_encode(x1_test, tokenizerSaved, 151)
X_test_sample = x1_test[:10]
X_test_sample1 = bert_encode(X_test_sample, tokenizerSaved, 151)
print(X1_test)
print('................................................................*****')
print(X_test_sample1['input_word_ids'])
kwargs = {k: v for k, v in X_test_sample1.items() if k == 'input_mask'}

# ...

explanation = ig.explain(X_test_sample1['input_word_ids'], baselines=None, target=None)
```

However I got this error:

```
ValueError                                Traceback (most recent call last)
<ipython-input> in <module>()
      1
----> 2 explanation = ig.explain(X_test_sample1['input_word_ids'], baselines=None, target=None)

/usr/local/lib/python3.7/dist-packages/alibi/explainers/integrated_gradients.py in explain(self, X, forward_kwargs, baselines, target, attribute_to_layer_inputs)
    793
    794         else:
--> 795             raise ValueError("Input must be a np.ndarray or a list of np.ndarray")
    796
    797         # defining integral method

ValueError: Input must be a np.ndarray or a list of np.ndarray
```
gipster commented 3 years ago

Hi @nouna99. The error says that your input is not a numpy array, and it looks like you are not passing forward_kwargs to explain. However, it is very difficult to help you if you don't include some self-contained code that we can easily run.

nouna99 commented 3 years ago

The input is below; it's not an array but a tensor. Does IG require an array as input?

```
tf.Tensor(
[[   101  10212  27876 ...      0      0      0]
 [   101  13684  10151 ...      0      0      0]
 [   101  13501  11337 ...      0      0      0]
 ...
 [   101  27224 102422 ...      0      0      0]
 [   101  10114  49151 ...      0      0      0]
 [   101  11760  10110 ...      0      0      0]], shape=(10, 151), dtype=int32)
```

jklaise commented 3 years ago

@nouna99 yes, a numpy array is expected; you can see this in the explain method docstring.

I believe the example notebook I referenced above should answer all your questions as it's exactly the same use case.
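Concretely, a small sketch of that conversion (assuming X_test_sample1 is the dict of tf.Tensors shown above and ig is the already-instantiated explainer):

```python
# IntegratedGradients.explain expects numpy arrays, so convert the tf.Tensor
# objects returned by the tokenizer/encoder first.
X_np = {k: v.numpy() for k, v in X_test_sample1.items()}

explanation = ig.explain(X_np['input_word_ids'],
                         forward_kwargs={'input_mask': X_np['input_mask'],
                                         'input_type_ids': X_np['input_type_ids']},
                         baselines=None,
                         target=None)
```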

nouna99 commented 3 years ago

Hello, I used numpy to convert the input tensor to an array:

```python
explanation = ig.explain(X_test_sample1['input_word_ids'].numpy(),
                         forward_kwargs=kwargs,
                         baselines=None,
                         target=None)
```

This time I got this error. Please help me!


```
ValueError                                Traceback (most recent call last)
<ipython-input> in <module>()
      1
----> 2 explanation = ig.explain(X_test_sample1['input_word_ids'].numpy(), forward_kwargs=kwargs, baselines=None, target=None)

4 frames
/usr/local/lib/python3.7/dist-packages/keras/engine/input_spec.py in assert_input_compatibility(input_spec, inputs, layer_name)
    200             str(len(input_spec)) + ' input(s), '
    201             'but it received ' + str(len(inputs)) +
--> 202             ' input tensors. Inputs received: ' + str(inputs))
    203     for input_index, (x, spec) in enumerate(zip(inputs, input_spec)):
    204       if spec is None:

ValueError: Layer model expects 3 input(s), but it received 1 input tensors. Inputs received: []
```