aspamers / siamese

A simple, easy-to-use and flexible siamese neural network implementation for Keras
MIT License
65 stars 16 forks

What would the shape be for datasets other than images? #4

Closed faaizuddin closed 4 years ago

faaizuddin commented 4 years ago

I am trying to use your example_mnist_siamese code. I have my own data where x_train.shape = (100, 3) and y_train.shape = (100,). I am having trouble defining the shape for the siamese model. For the image data, you have (28, 28, 1). But if I have some other data (like time series), in the line "base_model = create_base_model(input_shape)", what would the shape be?

aspamers commented 4 years ago

You just need to define the base_model with the correct input shape. If you wanted to train the siamese network on time series data you would provide a shape like input_shape = (number_of_steps, number_of_channels).

Here is a very simple example of a time series base model:

def create_base_model(input_shape):
    model_input = Input(shape=input_shape)
    embedding = LSTM(4)(model_input)
    embedding = Flatten()(embedding)
    embedding = Dense(128)(embedding)

    return Model(model_input, embedding)

It looks like in your input data you have 100 samples with 3 channels of data. I am going to assume that you have not done the necessary preprocessing yet. Have a look at keras timeseries preprocessing for information on how to extract windows from your data before feeding them into the model: https://keras.io/api/preprocessing/timeseries/
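
For instance, here is a rough sketch (not part of this repo) using keras's TimeseriesGenerator to slice a (100, 3) array into windows; the window length of 3 is just for illustration:

import numpy as np
from keras.preprocessing.sequence import TimeseriesGenerator

data = np.random.rand(100, 3)             # 100 time steps, 3 channels
targets = np.random.randint(2, size=100)

# sliding windows of 3 steps each -> samples of shape (3, 3)
generator = TimeseriesGenerator(data, targets, length=3, batch_size=16)
x_batch, y_batch = generator[0]
print(x_batch.shape)  # (16, 3, 3)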

faaizuddin commented 4 years ago
def create_base_model(input_shape):
    model_input = Input(shape=input_shape)
    embedding = LSTM(8)(model_input)
    embedding = Flatten()(embedding)
    embedding = Dense(128)(embedding)
    return Model(model_input, embedding)

def create_head_model(embedding_shape):
    embedding_a = Input(shape=embedding_shape)
    embedding_b = Input(shape=embedding_shape)

    head = Concatenate()([embedding_a, embedding_b])
    head = Dense(4)(head)
    head = BatchNormalization()(head)
    head = Activation(activation='sigmoid')(head)

    head = Dense(1)(head)
    head = BatchNormalization()(head)
    head = Activation(activation='sigmoid')(head)

    return Model([embedding_a, embedding_b], head)

input_shape = (3, 3)  # (number_of_steps, number_of_channels)
base_model = create_base_model(input_shape)
head_model = create_head_model(base_model.output_shape)
siamese_network = SiameseNetwork(base_model, head_model)

This code gives me this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-28-e31c62f6b6cb> in <module>
----> 1 base_model = create_base_model(input_shape)
      2 # head_model = create_head_model(base_model.output_shape)
      3 # siamese_network = SiameseNetwork(base_model, head_model)

<ipython-input-24-3d07d40922ec> in create_base_model(input_shape)
      1 def create_base_model(input_shape):
      2     model_input = Input(shape=input_shape)
----> 3     embedding = LSTM(8)(model_input)
      4     embedding = Flatten()(embedding)
      5     embedding = Dense(128)(embedding)

~\AppData\Roaming\Python\Python37\site-packages\keras\layers\recurrent.py in __call__(self, inputs, initial_state, constants, **kwargs)
    539 
    540         if initial_state is None and constants is None:
--> 541             return super(RNN, self).__call__(inputs, **kwargs)
    542 
    543         # If any of `initial_state` or `constants` are specified and are Keras

~\AppData\Roaming\Python\Python37\site-packages\keras\backend\tensorflow_backend.py in symbolic_fn_wrapper(*args, **kwargs)
     73         if _SYMBOLIC_SCOPE.value:
     74             with get_graph().as_default():
---> 75                 return func(*args, **kwargs)
     76         else:
     77             return func(*args, **kwargs)

~\AppData\Roaming\Python\Python37\site-packages\keras\engine\base_layer.py in __call__(self, inputs, **kwargs)
    473 
    474             # Handle mask propagation.
--> 475             previous_mask = _collect_previous_mask(inputs)
    476             user_kwargs = kwargs.copy()
    477             if not is_all_none(previous_mask):

~\AppData\Roaming\Python\Python37\site-packages\keras\engine\base_layer.py in _collect_previous_mask(input_tensors)
   1439             inbound_layer, node_index, tensor_index = x._keras_history
   1440             node = inbound_layer._inbound_nodes[node_index]
-> 1441             mask = node.output_masks[tensor_index]
   1442             masks.append(mask)
   1443         else:

AttributeError: 'Node' object has no attribute 'output_masks'
aspamers commented 4 years ago

AttributeError: 'Node' object has no attribute 'output_masks'

That error usually happens when you mix tensorflow.keras packages with pure keras. This package is written to use the pure keras API. If you replace any references to tf.keras or tensorflow.keras with keras throughout your code, it should work.
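
For example, a before/after of the import swap (the specific imports here are just for illustration):

# before (mixed): this is what triggers the 'Node' object error
# from tensorflow.keras.layers import Input, Dense, LSTM

# after (pure keras): what this package expects
from keras.layers import Input, Dense, LSTM
from keras.models import Model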

faaizuddin commented 4 years ago

Now I only have pure Keras. Keeping everything else as it is, I get this error due to the shape. I just used your example to generate random data.

x_train = np.random.rand(100, 3)
y_train = np.random.randint(2, size=100)

x_test = np.random.rand(30, 3)
y_test = np.random.randint(2, size=30)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-22-c61eaa44cbab> in <module>
     22 
     23 input_shape = (3, 3)  # (number_of_steps, number_of_channels)
---> 24 base_model = create_base_model(input_shape)
     25 head_model = create_head_model(base_model.output_shape)
     26 siamese_network = SiameseNetwork(base_model, head_model)

<ipython-input-22-c61eaa44cbab> in create_base_model(input_shape)
      2     model_input = Input(shape=input_shape)
      3     embedding = LSTM(4)(model_input)
----> 4     embedding = Flatten()(embedding)
      5     embedding = Dense(128)(embedding)
      6     return Model(model_input, embedding)

~\AppData\Roaming\Python\Python37\site-packages\keras\backend\tensorflow_backend.py in symbolic_fn_wrapper(*args, **kwargs)
     73         if _SYMBOLIC_SCOPE.value:
     74             with get_graph().as_default():
---> 75                 return func(*args, **kwargs)
     76         else:
     77             return func(*args, **kwargs)

~\AppData\Roaming\Python\Python37\site-packages\keras\engine\base_layer.py in __call__(self, inputs, **kwargs)
    444                 # Raise exceptions in case the input is not compatible
    445                 # with the input_spec specified in the layer constructor.
--> 446                 self.assert_input_compatibility(inputs)
    447 
    448                 # Collect input shapes to build layer.

~\AppData\Roaming\Python\Python37\site-packages\keras\engine\base_layer.py in assert_input_compatibility(self, inputs)
    356                                      self.name + ': expected min_ndim=' +
    357                                      str(spec.min_ndim) + ', found ndim=' +
--> 358                                      str(K.ndim(x)))
    359             # Check dtype.
    360             if spec.dtype is not None:

ValueError: Input 0 is incompatible with layer flatten_8: expected min_ndim=3, found ndim=2
faaizuddin commented 4 years ago

I cannot seem to get the correct shape for the LSTM input, which requires a 3D shape, I believe.

aspamers commented 4 years ago

The input dimensions from that example are not appropriate for a time series model. Time series data has 3 dimensions of input:

x_train = np.random.rand(100, 3, 3)

This would generate 100 random windows with 3 time steps and 3 channels which would fit into the model that you have defined.

faaizuddin commented 4 years ago

Up until that point, I have not even fed the data into the model. The error occurs even before model.fit(), and so the same issue remains.

x_train = np.random.rand(100, 3, 3)
y_train = np.random.randint(2, size=100)

x_test = np.random.rand(30, 3, 3)
y_test = np.random.randint(2, size=30)

def create_base_model(input_shape):
    model_input = Input(shape=input_shape)
    embedding = LSTM(4)(model_input)
    embedding = Flatten()(embedding)
    embedding = Dense(128)(embedding)
    return Model(model_input, embedding)

def create_head_model(embedding_shape):
    embedding_a = Input(shape=embedding_shape)
    embedding_b = Input(shape=embedding_shape)

    head = Concatenate()([embedding_a, embedding_b])
    head = Dense(4)(head)
    head = BatchNormalization()(head)
    head = Activation(activation='sigmoid')(head)

    head = Dense(1)(head)
    head = BatchNormalization()(head)
    head = Activation(activation='sigmoid')(head)

    return Model([embedding_a, embedding_b], head)

input_shape = (100, 3, 3)  # (number_of_steps, number_of_channels)
base_model = create_base_model(input_shape)
head_model = create_head_model(base_model.output_shape)
siamese_network = SiameseNetwork(base_model, head_model)

ValueError: Input 0 is incompatible with layer lstm_13: expected ndim=3, found ndim=4

faaizuddin commented 4 years ago

Can you please copy paste the code and see if you get the same error?

aspamers commented 4 years ago

input_shape = (100, 3, 3) # this leads to (?, 100, 3, 3) = 4 dimensions

This line should go back to what it was before:

input_shape = (3, 3) # this leads to (?, 3, 3) = 3 dimensions

expected ndim=3, found ndim=4

The Input layer will always add one variable-length batch dimension, which is why your model now complains about a 4th dimension.
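
You can see this with a minimal check (the printed shape may display as (None, 3, 3) depending on your keras version):

from keras.layers import Input

x = Input(shape=(3, 3))
print(x.shape)  # (?, 3, 3): the batch dimension plus (number_of_steps, number_of_channels)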

faaizuddin commented 4 years ago

This is becoming funny. 😛

ValueError: Input 0 is incompatible with layer flatten_13: expected min_ndim=3, found ndim=2

aspamers commented 4 years ago

Yeah, it's because of tensorflow 2.0 :). The way that shapes are handled has changed a bit :facepalm:. I will just take the time to update the repo since people are migrating to it.

Edit this line:

head_model = create_head_model(base_model.output_shape)

Replace with:

head_model = create_head_model(base_model.output_shape[1:])
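
The reason is that base_model.output_shape includes the leading batch dimension, so slicing it off gives the head model's Input the shape it actually expects. Roughly, for the base model above (which ends in Dense(128)):

print(base_model.output_shape)      # (None, 128): includes the batch dimension
print(base_model.output_shape[1:])  # (128,): what Input(shape=...) expects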

faaizuddin commented 4 years ago

still the same issue

aspamers commented 4 years ago

I am going to have to go for now. I will try running the code tomorrow and see what can be done to get it running.

faaizuddin commented 4 years ago

Thanks. 😄 I have yet to see any implementation of Siamese networks for time series (classification) on the internet.

aspamers commented 4 years ago

Ok, I ran the code and quite a few things have changed in the newer version of keras/tensorflow. I ended up fixing a bug on the master branch, so you should definitely reinstall the siamese package.

This code ran without issue for me on a google colab notebook:

import keras
from keras.datasets import mnist
from keras.layers import Conv2D, MaxPooling2D, BatchNormalization, Activation, Concatenate, LSTM
from keras import backend as K
from keras.callbacks import ModelCheckpoint, EarlyStopping
from keras.models import Model
from keras.layers import Input, Flatten, Dense
import numpy as np
from siamese import SiameseNetwork

batch_size = 128
epochs = 999999

x_train = np.random.rand(100, 3, 3)
y_train = np.random.randint(2, size=100)

x_test = np.random.rand(30, 3, 3)
y_test = np.random.randint(2, size=30)

def create_base_model(input_shape):
    model_input = Input(shape=input_shape)

    embedding = LSTM(4)(model_input)
    embedding = Flatten()(embedding)
    embedding = Dense(128)(embedding)

    return Model(model_input, embedding)

def create_head_model(embedding_shape):
    embedding_a = Input(shape=embedding_shape)
    embedding_b = Input(shape=embedding_shape)

    head = Concatenate()([embedding_a, embedding_b])
    head = Dense(4)(head)
    head = BatchNormalization()(head)
    head = Activation(activation='sigmoid')(head)

    head = Dense(1)(head)
    head = Activation(activation='sigmoid')(head)

    return Model([embedding_a, embedding_b], head)

input_shape = (3, 3)  # (number_of_steps, number_of_channels)
base_model = create_base_model(input_shape)
head_model = create_head_model(base_model.output_shape[1:])
siamese_network = SiameseNetwork(base_model, head_model)

siamese_network.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

siamese_checkpoint_path = "./siamese_checkpoint"

siamese_callbacks = [
    EarlyStopping(monitor='val_accuracy', patience=10, verbose=0),
    ModelCheckpoint(siamese_checkpoint_path, monitor='val_accuracy', save_best_only=True, verbose=0)
]

siamese_network.fit(x_train, y_train,
                    validation_data=(x_test, y_test),
                    batch_size=128,
                    epochs=epochs,
                    callbacks=siamese_callbacks,
                    verbose=1)

# This assumes tensorflow 2.0 is being used
siamese_network.load_weights(siamese_checkpoint_path + '/variables/variables')
# siamese_network.load_weights(siamese_checkpoint_path) # Use this if not tensorflow 2.0
faaizuddin commented 4 years ago

Thanks. It works when I remove embedding = Flatten()(embedding). How does it work for you with that line? Also, after

siamese_network.fit(x_train, y_train,
                    validation_data=(x_test, y_test),
                    batch_size=128,
                    epochs=epochs,
                    callbacks=siamese_callbacks,
                    verbose=1)

how do you make predictions? I thought simply running siamese_network.predict([x1, x2]) would suffice. I need to check whether two time series are similar or not.

aspamers commented 4 years ago

thanks. it works when I remove embedding = Flatten()(embedding)

Different versions of keras/tensorflow handle LSTMs and shapes a bit differently. That might be the reason.

how do you make predictions?

You have some options here:

1) You can directly use the output of siamese_network.predict([x1, x2]) to measure similarity between windows.

2) You can use the same technique that is used in the mnist siamese example and attach a softmax layer to the output of the now-trained base_model. You can then train a plain old classifier using the weights that were learned with the siamese network as a starting point. This is called transfer learning.

3) You can get the embeddings for each window using base_model.predict(x1). https://towardsdatascience.com/neural-network-embeddings-explained-4d028e6f0526
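
A rough sketch of options 1 and 3 (assuming x1 and x2 are window arrays of shape (num_windows, 3, 3), matching input_shape = (3, 3) above):

# Option 1: the trained head outputs a similarity score in [0, 1] per pair
similarity_scores = siamese_network.predict([x1, x2])

# Option 3: get a 128-dimensional embedding per window from the trained base model
embeddings_a = base_model.predict(x1)
embeddings_b = base_model.predict(x2)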