Closed: smestern closed this issue 4 years ago.
Thanks for the question, @smestern. In general, a graph loss of 0 means that a sample's embedding/prediction matches that of its neighbors. However, I am not sure if that is genuinely the case in your workflow or if there is some sort of a pilot error. Could you provide more details on your workflow?
How are you invoking the graph regularization APIs in NSL? What graph-specific hyperparameters (number of neighbors, multiplier for graph regularization, etc.) are you using? A minimal code example would also help us understand the issue better. Are you doing just training, just evaluation, or both?
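To make the first point above concrete, here is a minimal sketch (plain Python, not NSL's actual implementation) of how a graph regularization term behaves: it penalizes the distance between a sample's output and its neighbors' outputs, so it is exactly 0 whenever they match. The function name and signature here are illustrative only.

```python
# Illustrative sketch (NOT NSL's implementation): the graph term is a
# weighted squared-L2 distance between a sample's output and each
# neighbor's output, scaled by the regularization multiplier.

def graph_loss(sample, neighbors, weights, multiplier=0.1):
    """Weighted squared-L2 distance from a sample to its neighbors."""
    total = 0.0
    for nbr, w in zip(neighbors, weights):
        total += w * sum((s - n) ** 2 for s, n in zip(sample, nbr))
    return multiplier * total

# A sample whose output matches all of its neighbors -> loss is exactly 0.
print(graph_loss([0.2, 0.8], [[0.2, 0.8], [0.2, 0.8]], [1.0, 1.0]))  # 0.0

# Any mismatch makes the term strictly positive (~0.032 here).
print(graph_loss([0.2, 0.8], [[0.6, 0.4]], [1.0]))
```

So a persistent 0 means either the sample and neighbor outputs really are identical, or the neighbor terms are never being computed at all.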
Hello, I fear it is likely a pilot error. The dataset itself consists of feature vectors of raw waveforms recorded from neurons using patch-clamp electrophysiology. To build the graph, I am using previously computed sparse principal component values (~14 PCs) for each waveform as the 'embeddings' (not sure if this is an appropriate use). Then I use the build-graph function and the pack-neighbors function. My hparams are:
self.distance_type = nsl.configs.DistanceType.L2
self.graph_regularization_multiplier = 0.1
self.num_neighbors = 3
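(As an aside, what I expect these settings to mean conceptually: with num_neighbors = 3 and L2 distance, each sample's graph neighbors are roughly its closest embeddings. A toy sketch of that idea, using a hypothetical helper rather than the actual nsl.tools.build_graph, which builds edges from a similarity threshold instead:)

```python
# Conceptual sketch of neighbor selection: for each embedding, keep the
# num_neighbors closest embeddings under L2 distance. This is only an
# illustration of what the graph edges represent, not what
# nsl.tools.build_graph actually does internally.

def l2_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def nearest_neighbors(embeddings, num_neighbors=3):
    """Return, for each sample, the indices of its closest embeddings."""
    result = []
    for i, e in enumerate(embeddings):
        dists = [(l2_distance(e, other), j)
                 for j, other in enumerate(embeddings) if j != i]
        dists.sort()
        result.append([j for _, j in dists[:num_neighbors]])
    return result

# Three clustered points plus one outlier: the outlier still gets the
# two nearest cluster points as its neighbors.
embeddings = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]]
print(nearest_neighbors(embeddings, num_neighbors=2))
```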
and my modified model is:
def build_base_model():
  """Builds a model according to the architecture defined in `hparams`."""
  inputs = tf.keras.Input(
      shape=(HPARAMS.max_seq_length,), dtype=tf.float32, name='waves')
  x = inputs
  # Add a channel dimension so Conv1D can consume the raw waveform.
  x = tf.keras.layers.Reshape(
      (HPARAMS.max_seq_length, 1),
      input_shape=(HPARAMS.max_seq_length,))(x)
  for i, num_filters in enumerate(HPARAMS.conv_filters):
    x = tf.keras.layers.Conv1D(
        num_filters, HPARAMS.kernel_size, activation='relu')(x)
    if i < len(HPARAMS.conv_filters) - 1:
      # Max pooling between convolutional layers.
      x = tf.keras.layers.MaxPooling1D(HPARAMS.pool_size)(x)
  x = tf.keras.layers.Flatten()(x)
  for num_units in HPARAMS.num_fc_units:
    x = tf.keras.layers.Dense(num_units, activation='relu')(x)
  pred = tf.keras.layers.Dense(HPARAMS.num_classes, activation='softmax')(x)
  model = tf.keras.Model(inputs=inputs, outputs=pred)
  return model
I am hoping to use the model in a semi-supervised manner, for both training and evaluation.
Thanks @smestern! Just from your description above, I don't see anything obviously amiss. Are you seeing a graph loss of 0 in training mode? During evaluation, the graph loss is expected to be 0 because we don't compute it then.
To help debug further, it might help to reduce the number of input examples and the batch size in your dataset. You can also try printing the sample's logits and the neighbors' logits to see what those values are.
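Once you have those printed values, a quick check along these lines can distinguish the two ways the graph term ends up at 0. This is a hypothetical helper (the name and the exact distance formula are illustrative, not NSL code):

```python
# Hypothetical debugging helper: given a sample's logits and its
# neighbors' logits, report why the graph term might be zero --
# either no neighbors were found for the sample, or the neighbor
# logits exactly match the sample's.

def diagnose_graph_term(sample_logits, neighbor_logits):
    if not neighbor_logits:
        return 'no neighbors: graph term is trivially 0'
    term = sum(sum((s - n) ** 2 for s, n in zip(sample_logits, nbr))
               for nbr in neighbor_logits)
    if term == 0.0:
        return 'neighbor logits identical to sample: graph term is 0'
    return 'graph term is %.6f' % term

print(diagnose_graph_term([0.9, 0.1], []))            # no neighbors
print(diagnose_graph_term([0.9, 0.1], [[0.9, 0.1]]))  # identical logits
print(diagnose_graph_term([0.9, 0.1], [[0.5, 0.5]]))  # nonzero term
```

If the problematic dataset falls into the "no neighbors" case, the issue is upstream in how the graph was built, not in the model.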
Hello, I am a bit new to the world of deep NNs, so there may be a simple answer to this. I am using two custom datasets (each approx. 2500 samples, shape (700, 1)). I have been training a graph neural network on both. One appears to function normally; however, the other consistently reports a graph_loss of 0.0000e+00 regardless of the number of epochs. Is this to be expected? Does this mean that the graph for this dataset is not contributing anything? The code more-or-less follows the https://www.tensorflow.org/neural_structured_learning/tutorials/graph_keras_lstm_imdb tutorial exactly.