marcoancona / DeepExplain

A unified framework of perturbation and gradient-based attribution methods for Deep Neural Networks interpretability. DeepExplain also includes support for Shapley Values sampling. (ICLR 2018)
https://arxiv.org/abs/1711.06104
MIT License

ValueError: None values not supported #10

Closed: andgan closed this issue 6 years ago

andgan commented 6 years ago

I'm running the following model:

from keras.layers import Input, Embedding, Dense, Dropout, LSTM, concatenate
from keras.models import Model
from keras.optimizers import Adagrad
from keras.regularizers import l2

input1 = Input(shape=(length_timestamps,))
x1 = Embedding(size_icd, int(6 * size_icd ** (1. / 3)), input_length=length_timestamps)(input1)

input2 = Input(shape=(length_covar,))
x2 = Dense(2)(input2)

x1 = Dropout(0.7)(x1)
x1 = LSTM(100, kernel_regularizer=l2(0.))(x1)
x = concatenate([x1, x2])
x = Dense(32, activation='relu')(x)
outlayer = Dense(1, activation='sigmoid')(x)

model = Model(inputs=[input1, input2], outputs=outlayer)

adagrad_custom = Adagrad(lr=0.01, epsilon=None, decay=0.0)
model.compile(optimizer=adagrad_custom, loss='binary_crossentropy', metrics=['acc'])

history = model.fit([train_x[1:10000], train_x_covar[1:10000]], train_y[1:10000],
                    epochs=10, batch_size=312,
                    validation_data=([test_x, test_x_covar], test_y))

and then I try DeepExplain:

with DeepExplain(session=K.get_session()) as de:  # <-- init DeepExplain context
    # Need to reconstruct the graph in DeepExplain context, using the same weights.
    input_tensors = model.layers[0].input
    output_tensors = model.layers[4].output
    fModel = Model(inputs=input_tensors, outputs=output_tensors)
    target_tensor = fModel(input_tensors)

    attributions = de.explain('grad*input', target_tensor, input_tensors, train_x[1:10000])
    print("Attributions:\n", attributions)

and get this error:

DeepExplain: running "grad*input" explanation method (2)
Model with multiple inputs:  False
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-36-49d854e1b4d8> in <module>()
      6     target_tensor = fModel(input_tensors)
      7 
----> 8     attributions = de.explain('grad*input', target_tensor, input_tensors, train_x_covar[1:10000])
      9     print ("Attributions:\n", attributions)

~/src/deepexplain/deepexplain/tensorflow/methods.py in explain(self, method, T, X, xs, **kwargs)
    455         _ENABLED_METHOD_CLASS = method_class
    456         method = _ENABLED_METHOD_CLASS(T, X, xs, self.session, self.keras_phase_placeholder, **kwargs)
--> 457         result = method.run()
    458         if issubclass(_ENABLED_METHOD_CLASS, GradientBasedMethod) and _GRAD_OVERRIDE_CHECKFLAG == 0:
    459             warnings.warn('DeepExplain detected you are trying to use an attribution method that requires '

~/src/deepexplain/deepexplain/tensorflow/methods.py in run(self)
    122 
    123     def run(self):
--> 124         attributions = self.get_symbolic_attribution()
    125         results =  self.session_run(attributions, self.xs)
    126         return results[0] if not self.has_multiple_inputs else results

~/src/deepexplain/deepexplain/tensorflow/methods.py in get_symbolic_attribution(self)
    190         return [g * x for g, x in zip(
    191             tf.gradients(self.T, self.X),
--> 192             self.X if self.has_multiple_inputs else [self.X])]
    193 
    194 

~/src/deepexplain/deepexplain/tensorflow/methods.py in <listcomp>(.0)
    188 
    189     def get_symbolic_attribution(self):
--> 190         return [g * x for g, x in zip(
    191             tf.gradients(self.T, self.X),
    192             self.X if self.has_multiple_inputs else [self.X])]

~/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py in r_binary_op_wrapper(y, x)
    984   def r_binary_op_wrapper(y, x):
    985     with ops.name_scope(None, op_name, [x, y]) as name:
--> 986       x = ops.convert_to_tensor(x, dtype=y.dtype.base_dtype, name="x")
    987       return func(x, y, name=name)
    988 

~/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py in convert_to_tensor(value, dtype, name, preferred_dtype)
    948       name=name,
    949       preferred_dtype=preferred_dtype,
--> 950       as_ref=False)
    951 
    952 

~/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py in internal_convert_to_tensor(value, dtype, name, as_ref, preferred_dtype, ctx)
   1038 
   1039     if ret is None:
-> 1040       ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
   1041 
   1042     if ret is NotImplemented:

~/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py in _constant_tensor_conversion_function(v, dtype, name, as_ref)
    233                                          as_ref=False):
    234   _ = as_ref
--> 235   return constant(v, dtype=dtype, name=name)
    236 
    237 

~/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py in constant(value, dtype, shape, name, verify_shape)
    212   tensor_value.tensor.CopyFrom(
    213       tensor_util.make_tensor_proto(
--> 214           value, dtype=dtype, shape=shape, verify_shape=verify_shape))
    215   dtype_value = attr_value_pb2.AttrValue(type=tensor_value.tensor.dtype)
    216   const_tensor = g.create_op(

~/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py in make_tensor_proto(values, dtype, shape, verify_shape)
    419   else:
    420     if values is None:
--> 421       raise ValueError("None values not supported.")
    422     # if dtype is provided, forces numpy array to be the type
    423     # provided if possible.

ValueError: None values not supported.

and this is my layer configuration:

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_3 (InputLayer)            (None, 256)          0                                            
__________________________________________________________________________________________________
embedding_2 (Embedding)         (None, 256, 67)      97016       input_3[0][0]                    
__________________________________________________________________________________________________
dropout_2 (Dropout)             (None, 256, 67)      0           embedding_2[0][0]                
__________________________________________________________________________________________________
input_4 (InputLayer)            (None, 2)            0                                            
__________________________________________________________________________________________________
lstm_2 (LSTM)                   (None, 100)          67200       dropout_2[0][0]                  
__________________________________________________________________________________________________
dense_4 (Dense)                 (None, 2)            6           input_4[0][0]                    
__________________________________________________________________________________________________
concatenate_2 (Concatenate)     (None, 102)          0           lstm_2[0][0]                     
                                                                 dense_4[0][0]                    
__________________________________________________________________________________________________
dense_5 (Dense)                 (None, 32)           3296        concatenate_2[0][0]              
__________________________________________________________________________________________________
dense_6 (Dense)                 (None, 1)            33          dense_5[0][0]                    
==================================================================================================
Total params: 167,551
Trainable params: 167,551
Non-trainable params: 0
JoshPrim commented 6 years ago

I have also encountered this problem. Did you manage to solve it?

marcoancona commented 6 years ago

Gradient information cannot propagate through Embedding layers. For NLP models (or any model using lookups), please read https://github.com/marcoancona/DeepExplain#nlp--embedding-lookups. The input layer of the "explanation" model should be the first layer after the lookup. In the case of NLP, this means that you compute explanations on the vector representation of each word, instead of on the word itself. You can also check this discussion.
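For the model in this thread, a minimal (untested) sketch of that workaround would be: run the embedding lookup first, then attribute on the embedded vectors. The layer and input indices below are assumptions based on the layer summary posted above.

with DeepExplain(session=K.get_session()) as de:
    # 1. Perform the embedding lookup outside the attribution method.
    #    layers[1] is assumed to be embedding_2 (see the summary above).
    embedding_tensor = model.layers[1].output   # shape (None, 256, 67)
    embedding_out = K.get_session().run(
        embedding_tensor, {model.inputs[0]: train_x[1:10000]})

    # 2. Attribute on the embedded vectors and on the second (covariate) input.
    attributions = de.explain(
        'grad*input',
        model.layers[-1].output,                 # target tensor
        [embedding_tensor, model.inputs[1]],     # tensors to attribute on
        [embedding_out, train_x_covar[1:10000]]) # corresponding input values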

JoshPrim commented 6 years ago

Hey :) Thanks for the help. How would I have to modify the source code if I have several inputs/outputs in the embedding layer?
For example, in such a network structure:

Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            (None, 10)           0                                            
__________________________________________________________________________________________________
input_2 (InputLayer)            (None, 10)           0                                            
__________________________________________________________________________________________________
input_3 (InputLayer)            (None, 10)           0                                            
__________________________________________________________________________________________________
ModellEmbeddings (Embedding)    (None, 10, 32)       172096      input_1[0][0]                    
                                                                 input_2[0][0]                    
                                                                 input_3[0][0]                    
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, 10, 96)       0           ModellEmbeddings[0][0]           
                                                                 ModellEmbeddings[1][0]           
                                                                 ModellEmbeddings[2][0]           
__________________________________________________________________________________________________
conv1d_1 (Conv1D)               (None, 10, 100)      19300       concatenate_1[0][0]              
__________________________________________________________________________________________________
max_pooling1d_1 (MaxPooling1D)  (None, 5, 100)       0           conv1d_1[0][0]                   
__________________________________________________________________________________________________
lstm_1 (LSTM)                   (None, 100)          80400       max_pooling1d_1[0][0]            
__________________________________________________________________________________________________
main_output (Dense)             (None, 5383)         543683      lstm_1[0][0]                     
==================================================================================================

The approach described there does not work for me. I implemented it like this:

current_session = K.get_session()

with DeepExplain(session=current_session) as de:

    # load the model 
    model = load_model('model.h5')
    predictions = model.predict([input_0, input_1, input_2], verbose=2) 

    # predict on test data
    X_test = [input_0, input_1, input_2]
    y_pred = model.predict(X_test)

    # Evaluate the embedding tensor on the model input (in other words, perform the lookup)
    embedding_tensor = model.layers[3].output
    input_tensor = model.inputs[0]
    embedding_out = current_session.run(embedding_tensor, {input_tensor: X_test})

    xs = X_test
    ys = predictions

    # Run DeepExplain with the embedding as input
    attributions = de.explain('elrp', model.layers[-1].output * ys, model.layers[1].input, embedding_out)
    print("attributions shape --- {}".format(attributions.shape))
marcoancona commented 6 years ago

This is a different problem: in this case you have a shared embedding layer used three times. Don't you get an error when you call model.layers[3].output? Since layer 3 (the embedding) is shared, you should use model.layers[3].get_output_at(index), as described at https://keras.io/layers/about-keras-layers/. Another problem with your code is that de.explain is called with the input tensor model.layers[1].input instead of the tensor after the embedding lookup. Since you then concatenate the results of the embedding lookups, I would suggest calling de.explain on the input to the Concatenate layer, like the following (not tested):

current_session = K.get_session()

with DeepExplain(session=current_session) as de:

    # load the model
    model = load_model('model.h5')
    predictions = model.predict([input_0, input_1, input_2], verbose=2)

    # predict on test data
    X_test = [input_0, input_1, input_2]
    y_pred = model.predict(X_test)

    # Evaluate the tensors feeding the Concatenate layer (in other words, perform
    # the lookups). The model has three inputs, so feed all of them.
    concat_tensor = model.layers[4].input
    concat_out = current_session.run(concat_tensor, dict(zip(model.inputs, X_test)))

    ys = predictions

    # Run DeepExplain with the concatenated embeddings as input
    attributions = de.explain('elrp', model.layers[-1].output * ys, concat_tensor, concat_out)
    print("attributions shape --- {}".format(attributions.shape))
marcoancona commented 6 years ago

Closing, as the original issue is known and covered in the README. Please open a new issue if the second problem persists.