danielegrattarola / spektral

Graph Neural Networks with Keras and Tensorflow 2.
https://graphneural.network
MIT License

Issues with BatchLoader #229

Closed raphaelmourad closed 3 years ago

raphaelmourad commented 3 years ago

Hello Daniele,

I am trying to use BatchLoader because I have a set of graphs with the same dimensions, but as far as I have tried I could not make it work with model.fit(). Note that I cannot use your ready-to-use spektral.models.gcn.GCN(), since I do some feature engineering with classical convolutions inside my model.

I could easily make my own dataset using the examples you provided (thanks for that):

dataset = MyDataset(1, seqlen)
dataset.graphs

[Graph(n_nodes=5, n_node_features=200, n_edge_features=None, n_labels=5),
 Graph(n_nodes=5, n_node_features=200, n_edge_features=None, n_labels=5),
 ...]
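A dataset along these lines can be built by subclassing spektral.data.Dataset and implementing read(). A minimal sketch with random placeholder data (the constructor arguments and their meaning are assumptions, not the actual MyDataset):

```python
import numpy as np
from spektral.data import Dataset, Graph

class MyDataset(Dataset):
    # Hypothetical dataset of fixed-size graphs: 5 nodes, seqlen features per node.
    def __init__(self, n_graphs, seqlen, **kwargs):
        self.n_graphs = n_graphs
        self.seqlen = seqlen
        super().__init__(**kwargs)  # the parent __init__ calls read()

    def read(self):
        # One Graph per sample: integer node features, identity adjacency,
        # and one binary label per node (all randomly generated here).
        return [
            Graph(
                x=np.random.randint(0, 4, (5, self.seqlen)),
                a=np.eye(5),
                y=np.random.randint(0, 2, 5),
            )
            for _ in range(self.n_graphs)
        ]
```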

I built BatchLoaders from the dataset, which contains many graphs:

loader_tr = BatchLoader(dataset, batch_size=1)
loader_va = BatchLoader(dataset, batch_size=1)

Here is the data from the loader. For each batch, the inputs are a 5 × 200 feature matrix (5 samples, 200 features) and a 5 × 5 adjacency matrix; the output is the node class (5 binary values, one per sample).

inputs, target = loader_tr.__next__()
print(inputs[0])
print(inputs[1])
print(target)

inputs[0] (node features, shape (1, 5, 200), truncated):

[[[1 2 2 2 3 2 1 3 1 3 3 0 1 1 1 0 1 3 2 0 ...]
  [3 3 0 0 1 0 3 0 0 2 1 2 3 2 0 1 0 3 3 0 ...]
  [3 2 1 1 3 1 3 1 0 3 0 1 0 3 0 1 0 3 2 0 ...]
  [3 3 3 3 1 1 3 3 3 2 3 3 1 0 3 3 3 2 3 3 ...]
  [0 1 0 0 0 2 3 3 3 3 2 2 1 1 0 0 1 3 0 2 ...]]]

inputs[1] (adjacency, shape (1, 5, 5)):

[[[0.2       0.4472136 0.4472136 0.4472136 0.4472136]
  [0.        1.        0.        0.        0.       ]
  [0.        0.        1.        0.        0.       ]
  [0.        0.        0.        1.        0.       ]
  [0.        0.        0.        0.        1.       ]]]

target (shape (1, 5)):

[[1 0 0 0 0]]

Here is my model (with seqlen = 200 and N = 5):

# CNN model with GNN layer
X_in = Input(shape=(seqlen,))
A_in = Input(shape=(N,), sparse=False)

CNN = OneHot(input_dim=vocab_size, input_length=seqlen)(X_in)
CNN = Conv1D(num_filters, kernel_size, activation='relu', input_shape=[seqlen, vocab_size])(CNN)
CNN = GlobalMaxPooling1D()(CNN)
CNN = GaussianNoise(0.01)(CNN)

y_out = GCNConv(5, activation='relu')([CNN, A_in])
y_out = Dense(1, activation='sigmoid')(y_out)

model = Model(inputs=[X_in, A_in], outputs=y_out)
model.summary()

When I run:

history = model.fit(
    loader_tr.load(),
    steps_per_epoch=loader_tr.steps_per_epoch,
    validation_data=loader_va.load(),
    validation_steps=loader_va.steps_per_epoch,
    epochs=epochs,
    callbacks=[EarlyStopping(patience=es_patience, restore_best_weights=True)]
)

I get the message below. I understand this is because the loader yields batches (of size 1), so the node feature matrix has shape (1, 5, 200) rather than the (5, 200) the model expects. But even after changing my code many times, I could not make it work.

Epoch 1/200
WARNING:tensorflow:Model was constructed with shape (None, 200) for input KerasTensor(type_spec=TensorSpec(shape=(None, 200), dtype=tf.float32, name='input_56'), name='input_56', description="created by layer 'input_56'"), but it was called on an input with incompatible shape (None, None, None).
WARNING:tensorflow:Model was constructed with shape (None, 5) for input KerasTensor(type_spec=TensorSpec(shape=(None, 5), dtype=tf.float32, name='input_57'), name='input_57', description="created by layer 'input_57'"), but it was called on an input with incompatible shape (None, None, None).


ValueError                                Traceback (most recent call last)
----> 1 history = model.fit(
      2     loader_tr.load(),
      3     steps_per_epoch=loader_tr.steps_per_epoch,
      4     validation_data=loader_va.load(),
      5     validation_steps=loader_va.steps_per_epoch,

[... internal TensorFlow frames omitted ...]

ValueError: in user code:

    ValueError: Input 0 of layer global_max_pooling1d_23 is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: (None, None, None, 500)
danielegrattarola commented 3 years ago

Hi, you have the wrong input shapes for batch mode. They should be (N, F) and (N, N) respectively (since the batch size is implicit when defining Input layers).
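For the model above, that would look something like this (a sketch reusing N and seqlen from the post, with the batch dimension left implicit):

```python
X_in = Input(shape=(N, seqlen))  # node features: one (N, F) matrix per graph
A_in = Input(shape=(N, N))       # adjacency: one (N, N) matrix per graph
```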

One way to circumvent this kind of issue is to use the more modern model subclassing approach. Give it a spin :)
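A minimal subclassing sketch (an assumed toy model, not the model from this issue):

```python
import tensorflow as tf
from spektral.layers import GCNConv

class MyGNN(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.gcn = GCNConv(5, activation="relu")
        self.dense = tf.keras.layers.Dense(1, activation="sigmoid")

    def call(self, inputs):
        x, a = inputs          # shapes are resolved at call time,
        x = self.gcn([x, a])   # so there are no static Input specs to mismatch
        return self.dense(x)
```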

Cheers

raphaelmourad commented 3 years ago

OK, I got it working by reshaping with a Lambda layer using the Keras backend!
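A sketch of what such a fix might look like, reusing the layers and variables from the model above (hypothetical; the exact reshapes depend on the model):

```python
import tensorflow.keras.backend as K
from tensorflow.keras.layers import Lambda

X_in = Input(shape=(N, seqlen))
A_in = Input(shape=(N, N))

# Collapse the batch and node dimensions so the 1D CNN runs per node,
# i.e. on inputs of shape (batch * N, seqlen):
X = Lambda(lambda x: K.reshape(x, (-1, seqlen)))(X_in)
CNN = OneHot(input_dim=vocab_size, input_length=seqlen)(X)
CNN = Conv1D(num_filters, kernel_size, activation='relu')(CNN)
CNN = GlobalMaxPooling1D()(CNN)
# Restore the (batch, N, channels) layout that GCNConv expects in batch mode:
CNN = Lambda(lambda x: K.reshape(x, (-1, N, num_filters)))(CNN)
y_out = GCNConv(5, activation='relu')([CNN, A_in])
y_out = Dense(1, activation='sigmoid')(y_out)

model = Model(inputs=[X_in, A_in], outputs=y_out)
```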