tensorflow / neural-structured-learning

Training neural models with structured signals.
https://www.tensorflow.org/neural_structured_learning
Apache License 2.0

Regardless of epochs, some datasets return 'graph_loss: 0.0000e+00' #45

Closed smestern closed 4 years ago

smestern commented 4 years ago

Hello, I am a bit new to the world of deep NNs, so there may be a simple answer to this. I am using two custom datasets (each approx. 2500 samples, shape (700, 1)) and training a graph neural network on both. One appears to be functioning normally; however, the other consistently reports a graph_loss of 0.0000e+00 regardless of the number of epochs. Is this to be expected? Does it mean that the graph for this dataset is not contributing anything? The code more or less follows the https://www.tensorflow.org/neural_structured_learning/tutorials/graph_keras_lstm_imdb tutorial exactly.

arjung commented 4 years ago

Thanks for the question, @smestern. In general, a graph loss of 0 means that a sample's embedding/prediction matches that of its neighbors. However, I am not sure if that is genuinely the case in your workflow or if there is some sort of a pilot error. Could you provide more details on your workflow?
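For intuition, the graph regularization term has roughly the form (eliding batching and reduction details)

\mathcal{L}_{graph} = \alpha \sum_i \sum_{j \in \mathcal{N}(i)} w_{ij}\, d\big(f(x_i), f(x_j)\big)

where \alpha is the graph regularization multiplier, \mathcal{N}(i) is the set of packed neighbors of sample i, w_{ij} is the edge weight from the graph, and d is the configured distance (L2 in the tutorial). The term vanishes when every weighted sample-to-neighbor distance is 0.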

How are you invoking the graph regularization APIs in NSL? What graph-specific hyperparameters (number of neighbors, multiplier for graph regularization, etc.) are you using? A minimal code example would also help us understand the issue better. Are you doing just training, just evaluation, or both?

smestern commented 4 years ago

Hello, I fear it is likely a pilot error. The dataset itself consists of feature vectors of raw waveforms recorded from neurons using patch-clamp electrophysiology. To build the graph, I am using previously computed sparse principal component values (~14 PCs) for each waveform as the 'embeddings' (not sure if this is an appropriate use). Then I use the build-graph and pack-neighbors functions (a rough sketch of that step follows the hyperparameters below). My hparams are:

self.distance_type = nsl.configs.DistanceType.L2  
self.graph_regularization_multiplier = 0.1  
self.num_neighbors = 3
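The graph-building step looks roughly like the following sketch (the file paths, similarity threshold, and feature layout are placeholders rather than my exact values; each embedding record is a tf.train.Example with an 'id' feature and an 'embedding' feature holding the ~14 PC values for one waveform):

import neural_structured_learning as nsl

# Build a similarity graph over the precomputed PC 'embeddings'.
# Placeholder paths and threshold; adjust to the real pipeline.
nsl.tools.build_graph(['/tmp/waves/embeddings.tfr'],
                      '/tmp/waves/graph.tsv',
                      similarity_threshold=0.8)

# Augment each labeled example with features from up to `max_nbrs` neighbors.
nsl.tools.pack_nbrs('/tmp/waves/train_data.tfr',
                    '',  # no separate unlabeled examples in this sketch
                    '/tmp/waves/graph.tsv',
                    '/tmp/waves/nsl_train_data.tfr',
                    add_undirected_edges=True,
                    max_nbrs=3)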

and my modified model is:

def build_base_model():
  """Builds a model according to the architecture defined in `hparams`."""
  inputs = tf.keras.Input(
      shape=(HPARAMS.max_seq_length,), dtype=tf.float32, name='waves')

  x = inputs
  x = tf.keras.layers.Reshape((HPARAMS.max_seq_length,1), input_shape=(HPARAMS.max_seq_length,))(x)
  for i, num_filters in enumerate(HPARAMS.conv_filters):
    x = tf.keras.layers.Conv1D(
        num_filters, HPARAMS.kernel_size, activation='relu')(x)
    if i < len(HPARAMS.conv_filters) - 1:
      # max pooling between convolutional layers
      x = tf.keras.layers.MaxPooling1D(HPARAMS.pool_size)(x)
  x = tf.keras.layers.Flatten()(x)
  for num_units in HPARAMS.num_fc_units:
    x = tf.keras.layers.Dense(num_units, activation='relu')(x)
  pred = tf.keras.layers.Dense(HPARAMS.num_classes, activation='softmax')(x)
  model = tf.keras.Model(inputs=inputs, outputs=pred)
  return model
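The base model is then wrapped with graph regularization much as in the tutorial; roughly (a sketch rather than my exact code; train_dataset stands for the packed training data, and the optimizer/loss/epoch settings below are the tutorial's defaults, not anything tuned for my data):

base_model = build_base_model()

# Wrap the base model so the graph (neighbor) loss is added during training.
graph_reg_config = nsl.configs.make_graph_reg_config(
    max_neighbors=HPARAMS.num_neighbors,
    multiplier=HPARAMS.graph_regularization_multiplier,
    distance_type=HPARAMS.distance_type,
    sum_over_axis=-1)
graph_reg_model = nsl.keras.GraphRegularization(base_model, graph_reg_config)
graph_reg_model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'])
graph_reg_model.fit(train_dataset, epochs=HPARAMS.train_epochs, verbose=1)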

I am hoping to use the model in a semi-supervised manner, both for training and evaluation.

arjung commented 4 years ago

Thanks @smestern! Just from your description above, I don't see anything obviously amiss. Are you seeing a graph loss of 0 in training mode? During evaluation, the graph loss is expected to be 0 because we don't compute it then.

To help debug further, it might help to reduce the number of input examples and/or the batch size in your dataset. You can also try printing the sample's logits and the neighbors' logits to see what those values are.
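For example, something along these lines for one batch (a sketch only; it assumes the packed examples were parsed with the sample feature named 'waves' and the default 'NL_nbr_' prefix for neighbor features, as in the IMDB tutorial, so adjust the names to your pipeline):

# Run the base model on the sample feature and on the first neighbor's
# feature for a single batch, then compare the resulting logits.
# Feature names below are assumed from the tutorial's parsing code.
base_model = build_base_model()  # in practice, use your trained base model

for features, labels in train_dataset.take(1):
  sample_logits = base_model(features['waves'], training=False)
  nbr_logits = base_model(features['NL_nbr_0_waves'], training=False)
  print('sample logits:   ', sample_logits.numpy()[:3])
  print('neighbor logits: ', nbr_logits.numpy()[:3])
  print('neighbor weights:', features['NL_nbr_0_weight'].numpy()[:3])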