SeldonIO / alibi

Algorithms for explaining machine learning models
https://docs.seldon.io/projects/alibi/en/stable/

RuntimeError: The Session graph is empty. Add operations to the graph before calling run() #959

Open fraseralex96 opened 1 year ago

fraseralex96 commented 1 year ago

Hi Team,

I get the following error when trying to initialise the CEM explainer:

```
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Input In [52], in <cell line: 4>()
      2 shape = (1,) + X_train.shape[1:]
      3 mode = 'PN'
----> 4 cem = CEM(model, mode, shape, kappa=0., beta=.1,
      5           feature_range=(X_train.min(), X_train.max()),
      6           gamma=100, max_iterations=1000,
      7           c_init=1., c_steps=10, learning_rate_init=1e-2,
      8           clip=(-1000.,1000.), no_info_val=-1.)

File ~/.local/lib/python3.9/site-packages/alibi/explainers/cem.py:107, in CEM.__init__(self, predict, mode, shape, kappa, beta, feature_range, gamma, ae_model, learning_rate_init, max_iterations, c_init, c_steps, eps, clip, update_num_grad, no_info_val, write_dir, sess)
    105 if is_model:  # Keras or TF model
    106     self.model = True
--> 107     classes = self.sess.run(self.predict(tf.convert_to_tensor(np.zeros(shape), dtype=tf.float32))).shape[1]
    108 else:
    109     self.model = False

File ~/.local/lib/python3.9/site-packages/tensorflow/python/client/session.py:969, in BaseSession.run(self, fetches, feed_dict, options, run_metadata)
    966 run_metadata_ptr = tf_session.TF_NewBuffer() if run_metadata else None
    968 try:
--> 969   result = self._run(None, fetches, feed_dict, options_ptr,
    970                      run_metadata_ptr)
    971   if run_metadata:
    972     proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

File ~/.local/lib/python3.9/site-packages/tensorflow/python/client/session.py:1119, in BaseSession._run(self, handle, fetches, feed_dict, options, run_metadata)
   1117   raise RuntimeError('Attempted to use a closed Session.')
   1118 if self.graph.version == 0:
-> 1119   raise RuntimeError('The Session graph is empty. Add operations to the '
   1120                      'graph before calling run().')
   1122 # Create request.
   1123 feed_dict_tensor = {}

RuntimeError: The Session graph is empty. Add operations to the graph before calling run().
```

My model architecture is as follows:

```python
from tensorflow import keras
from tensorflow.keras import layers

## CNN using functional API
def DeepIC50_tester(X_train, learning_rate, momentum, seed):

    # set layer weights initialiser
    initializer = keras.initializers.GlorotUniform(seed=seed)

    # drug-cell line data input
    x_input = layers.Input(shape=(X_train.shape[1], 1))

    # 1st convolution layer
    x = layers.Conv1D(filters=16, kernel_size=11, kernel_initializer=initializer, activation='relu')(x_input)
    x = layers.BatchNormalization()(x)
    #x = layers.MaxPooling1D()(x)
    #x = layers.Dropout(0.1)(x)

    # 2nd convolution layer
    x = layers.Conv1D(filters=16, kernel_size=11, kernel_initializer=initializer, activation='relu')(x)
    x = layers.BatchNormalization()(x)
    x = layers.MaxPooling1D()(x)
    #x = layers.Dropout(0.1)(x)

    # 3rd convolution layer
    x = layers.Conv1D(filters=32, kernel_size=11, kernel_initializer=initializer, activation='relu')(x)
    x = layers.BatchNormalization()(x)
    #x = layers.MaxPooling1D()(x)
    #x = layers.Dropout(0.1)(x)

    # 4th convolution layer
    x = layers.Conv1D(filters=32, kernel_size=11, kernel_initializer=initializer, activation='relu')(x)
    x = layers.BatchNormalization()(x)
    x = layers.MaxPooling1D()(x)
    #x = layers.Dropout(0.1)(x)

    # 5th convolution layer
    x = layers.Conv1D(filters=64, kernel_size=11, kernel_initializer=initializer, activation='relu')(x)
    x = layers.BatchNormalization()(x)
    x = layers.MaxPooling1D()(x)
    #x = layers.Dropout(0.1)(x)

    # 6th convolution layer
    x = layers.Conv1D(filters=64, kernel_size=11, kernel_initializer=initializer, activation='relu')(x)
    x = layers.BatchNormalization()(x)
    x = layers.MaxPooling1D()(x)
    #x = layers.Dropout(0.1)(x)

    x = layers.Flatten()(x)

    # 5 fully connected layers with batch norm and dropout
    x = layers.Dense(128, kernel_initializer=initializer, activation='relu')(x)
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.1)(x)
    x = layers.Dense(256, kernel_initializer=initializer, activation='relu')(x)
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.1)(x)
    x = layers.Dense(512, kernel_initializer=initializer, activation='relu')(x)
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.1)(x)
    x = layers.Dense(256, kernel_initializer=initializer, activation='relu')(x)
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.1)(x)
    x = layers.Dense(128, kernel_initializer=initializer, activation='relu')(x)
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.1)(x)

    # output layer: 3 neurons, one per class
    output = layers.Dense(3, activation='softmax', activity_regularizer=keras.regularizers.l2())(x)

    model = keras.Model(x_input, output)

    model.compile(loss='sparse_categorical_crossentropy',
                  optimizer=keras.optimizers.Adam(),
                  metrics=['accuracy'])
    return model
```

I run this code when initialising:

```python
# initialize CEM explainer
shape = (1,) + X_train.shape[1:]
mode = 'PN'

cem = CEM(model, mode, shape, kappa=0., beta=.1,
          feature_range=(X_train.min(), X_train.max()),
          gamma=100, max_iterations=1000,
          c_init=1., c_steps=10, learning_rate_init=1e-2,
          clip=(-1000., 1000.), no_info_val=-1.)
```

Please assist! It would be much appreciated!

Alex

RobertSamoilescu commented 1 year ago

Hi @fraseralex96,

Try to add the following lines as in the example from here:

```python
import tensorflow as tf
tf.get_logger().setLevel(40)  # suppress deprecation messages
tf.compat.v1.disable_v2_behavior()  # disable TF2 behaviour as alibi code still relies on TF1 constructs
```
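For reference, a minimal sketch of the intended ordering (assuming, as in the linked alibi examples, that these lines run at the top of the notebook, before the model is built or loaded):

```python
import tensorflow as tf

tf.get_logger().setLevel(40)        # suppress deprecation messages
tf.compat.v1.disable_v2_behavior()  # CEM still relies on TF1 graph/session constructs

from alibi.explainers import CEM

# build (or load) and compile the Keras model only after the lines above,
# then construct the explainer as before:
# cem = CEM(model, mode, shape, ...)
```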
fraseralex96 commented 1 year ago

Hi Robert,

Thank you for your response.

Unfortunately, I now receive the following error when I build my NN model, because of the code above:

```
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
Input In [12], in <cell line: 7>()
      4 momentum = 0.5
      5 initializer_seed = 42
----> 7 model = DeepIC50(X_train, learning_rate, momentum, initializer_seed)

File ~/ML_notebook/ML_models.py:354, in DeepIC50(X_train, learning_rate, momentum, seed)
    350 output = layers.Dense(3, activation='softmax', activity_regularizer=keras.regularizers.l2())(x)
    352 model = keras.Model(x_input, output)
--> 354 model.compile(loss='sparse_categorical_crossentropy', optimizer=keras.optimizers.Adam(), metrics=['accuracy'])
    355 return model

File ~/.local/lib/python3.9/site-packages/tensorflow/python/trackable/base.py:204, in no_automatic_dependency_tracking.<locals>._method_wrapper(self, *args, **kwargs)
    202 self._self_setattr_tracking = False  # pylint: disable=protected-access
    203 try:
--> 204   result = method(self, *args, **kwargs)
    205 finally:
    206   self._self_setattr_tracking = previous_value  # pylint: disable=protected-access

File ~/.local/lib/python3.9/site-packages/keras/src/engine/training_v1.py:321, in Model.compile(self, optimizer, loss, metrics, loss_weights, sample_weight_mode, weighted_metrics, target_tensors, distribute, **kwargs)
    314 if self.run_eagerly:
    315     raise ValueError(
    316         "Session keyword arguments are not supported "
    317         "when run_eagerly=True. You passed the following "
    318         "Session arguments: %s" % (self._function_kwargs,)
    319     )
--> 321 self._set_optimizer(optimizer)
    322 is_any_keras_optimizer_v1 = any(
    323     (
    324         isinstance(opt, optimizer_v1.Optimizer)
    (...)
    327     for opt in tf.nest.flatten(self.optimizer)
    328 )
    330 if (
    331     is_any_keras_optimizer_v1
    332     and tf.compat.v1.executing_eagerly_outside_functions()
    333 ):

File ~/.local/lib/python3.9/site-packages/keras/src/engine/training_v1.py:1473, in Model._set_optimizer(self, optimizer)
   1471     self.optimizer = [optimizers.get(opt) for opt in optimizer]
   1472 else:
-> 1473     self.optimizer = optimizers.get(optimizer)
   1475 if self._dtype_policy.name == "mixed_float16" and not isinstance(
   1476     self.optimizer, loss_scale_optimizer.LossScaleOptimizer
   1477 ):
   1478     if isinstance(self.optimizer, list):

File ~/.local/lib/python3.9/site-packages/keras/src/optimizers/__init__.py:298, in get(identifier, **kwargs)
    291     optimizer_name = identifier.__class__.__name__
    292     logging.warning(
    293         "There is a known slowdown when using v2.11+ Keras optimizers "
    294         "on M1/M2 Macs. Falling back to the "
    295         "legacy Keras optimizer, i.e., "
    296         f"tf.keras.optimizers.legacy.{optimizer_name}."
    297     )
--> 298     return convert_to_legacy_optimizer(identifier)
    300 # Wrap legacy TF optimizer instances
    301 elif isinstance(identifier, tf.compat.v1.train.Optimizer):

File ~/.local/lib/python3.9/site-packages/keras/src/optimizers/__init__.py:222, in convert_to_legacy_optimizer(optimizer)
    216     raise ValueError(
    217         "convert_to_legacy_optimizer should only be called "
    218         "on instances of tf.keras.optimizers.Optimizer, but "
    219         f"received {optimizer} of type {type(optimizer)}."
    220     )
    221 optimizer_name = optimizer.__class__.__name__.lower()
--> 222 config = optimizer.get_config()
    223 # Remove fields that only exist in experimental optimizer.
    224 keys_to_remove = [
    225     "weight_decay",
    226     "use_ema",
    (...)
    230     "is_legacy_optimizer",
    231 ]

File ~/.local/lib/python3.9/site-packages/keras/src/optimizers/adam.py:211, in Adam.get_config(self)
    206 def get_config(self):
    207     config = super().get_config()
    209     config.update(
    210         {
--> 211             "learning_rate": self._serialize_hyperparameter(
    212                 self._learning_rate
    213             ),
    214             "beta_1": self.beta_1,
    215             "beta_2": self.beta_2,
    216             "epsilon": self.epsilon,
    217             "amsgrad": self.amsgrad,
    218         }
    219     )
    220     return config

File ~/.local/lib/python3.9/site-packages/keras/src/optimizers/optimizer.py:736, in _BaseOptimizer._serialize_hyperparameter(self, hyperparameter)
    734     return learning_rate_schedule.serialize(hyperparameter)
    735 if isinstance(hyperparameter, tf.Variable):
--> 736     return hyperparameter.numpy()
    737 if callable(hyperparameter):
    738     return hyperparameter()

File ~/.local/lib/python3.9/site-packages/tensorflow/python/ops/resource_variable_ops.py:689, in BaseResourceVariable.numpy(self)
    687 if context.executing_eagerly():
    688     return self.read_value().numpy()
--> 689 raise NotImplementedError(
    690     "numpy() is only available when eager execution is enabled.")

NotImplementedError: numpy() is only available when eager execution is enabled.
```

RobertSamoilescu commented 1 year ago

Try to use tensorflow v2.10:

```
pip install --upgrade tensorflow==2.10
```
fraseralex96 commented 1 year ago

Hi Robert,

I now get the following error when I try to run CEM (and also when I run the model I am trying to explain on its own):

```
2023-08-03 11:04:01.763603: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:354] MLIR V1 optimization pass is not enabled

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [22], in <cell line: 4>()
      2 shape = X_train.shape
      3 mode = 'PN'
----> 4 cem = CEM(model, mode, shape, kappa=0., beta=.1,
      5           feature_range=(X_train.min(), X_train.max()),
      6           gamma=100, max_iterations=1000,
      7           c_init=1., c_steps=10, learning_rate_init=1e-2,
      8           clip=(-1000.,1000.), no_info_val=-1.)

File ~/.local/lib/python3.9/site-packages/alibi/explainers/cem.py:235, in CEM.__init__(self, predict, mode, shape, kappa, beta, feature_range, gamma, ae_model, learning_rate_init, max_iterations, c_init, c_steps, eps, clip, update_num_grad, no_info_val, write_dir, sess)
    233     self.pred_proba_s = self.predict(self.delta_s)
    234 elif self.mode == "PN":
--> 235     self.pred_proba = self.predict(self.adv)
    236     self.pred_proba_s = self.predict(self.adv_s)
    238 # probability of target label prediction

File ~/.local/lib/python3.9/site-packages/keras/utils/traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     67 filtered_tb = _process_traceback_frames(e.__traceback__)
     68 # To get the full stack trace, call:
     69 # tf.debugging.disable_traceback_filtering()
---> 70 raise e.with_traceback(filtered_tb) from None
     71 finally:
     72     del filtered_tb

File ~/.local/lib/python3.9/site-packages/keras/engine/input_spec.py:250, in assert_input_compatibility(input_spec, inputs, layer_name)
    248 ndim = x.shape.rank
    249 if ndim is not None and ndim < spec.min_ndim:
--> 250     raise ValueError(
    251         f'Input {input_index} of layer "{layer_name}" '
    252         "is incompatible with the layer: "
    253         f"expected min_ndim={spec.min_ndim}, "
    254         f"found ndim={ndim}. "
    255         f"Full shape received: {tuple(shape)}"
    256     )
    257 # Check dtype.
    258 if spec.dtype is not None:

ValueError: Input 0 of layer "conv1d_18" is incompatible with the layer: expected min_ndim=3, found ndim=2. Full shape received: (Dimension(10427), Dimension(1378))
```

I tested it, and it is directly caused by this code:

```python
import tensorflow as tf
tf.get_logger().setLevel(40)  # suppress deprecation messages
tf.compat.v1.disable_v2_behavior()  # disable TF2 behaviour as alibi code still relies on TF1 constructs
```

Do you have any suggestions?

Thanks! Alex

RobertSamoilescu commented 1 year ago

My guess is that your input shape to the `Input` layer is not set properly. I see you are initialising it as:

```python
x_input = layers.Input(shape=(X_train.shape[1], 1))
```

I think the correct initialisation is:

```python
x_input = layers.Input(shape=X_train.shape[1:])
```

You should be able to test that everything is OK by performing a forward pass through your model:

```python
model(X_train[:1])
```

See the documentation here. Also, since you are using `Conv1D`, your training data must have shape `(B, L, D)`, where `B` is the batch size, `L` is the sequence length, and `D` is the channel size.
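As a concrete illustration, here is a minimal sketch (it assumes `X_train` is currently 2-D, which the `ndim=2` in the error suggests, and reuses the `layers` and model-building code from above):

```python
import numpy as np

# The error reports ndim=2, i.e. X_train is (B, L).
# Conv1D expects (B, L, D), so add a trailing channel axis to get (B, L, 1).
X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)

# With the channel axis present, X_train.shape[1:] gives (L, 1)
x_input = layers.Input(shape=X_train.shape[1:])
# ... build the rest of the model as before ...

model(X_train[:1])  # forward pass should now succeed without a shape error
```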

RobertSamoilescu commented 1 year ago

Related to the previous error you had when initialising the optimizer, you can probably try using the legacy optimizers with the newer versions of tensorflow (see here). There has been a change to the optimizers from tensorflow 2.11 (see release notes here).
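For example, a minimal sketch of what that could look like (assuming TF >= 2.11, where `tf.keras.optimizers.legacy.Adam` is available):

```python
from tensorflow import keras

# Passing a legacy optimizer instance directly avoids the automatic
# convert_to_legacy_optimizer() path seen in the traceback above, whose
# get_config() call fails when eager execution is disabled.
model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=keras.optimizers.legacy.Adam(),
    metrics=['accuracy'],
)
```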