philipperemy / cond_rnn

Conditional RNNs for Tensorflow / Keras.
MIT License

Adding dropout layer to stacked conditional RNNs #42

Closed · adityadivekar03 closed this issue 1 year ago

adityadivekar03 commented 1 year ago

I'm trying to build the following architecture, inspired by the stacked LSTM example in the repository. The only difference is that I also include Dropout layers between the stacked LSTM layers.

x = ConditionalRecurrent(LSTM(64,
                              batch_input_shape=(batchSize, num_samples, num_features), 
                              activation='tanh', #'relu', 
                              return_sequences=True, stateful=stateful))([i, c])
x = Dropout(0.2)(x)    
x = ConditionalRecurrent(LSTM(128, activation='relu', return_sequences=True, stateful=stateful))([x, c])
x = Dropout(0.2)(x)    
x = ConditionalRecurrent(LSTM(256, activation='relu', return_sequences=True, stateful=stateful))([x, c])
x = Dropout(0.2)(x)
x = Dense(units=78)(x)
x = LeakyReLU()(x)

This, however, gives me the error below at the Dropout layer:

Traceback (most recent call last):
  File "<ipython-input-111-1334bb85e127>", line 295, in run_LSTM
    history, model, Y_pred = train_and_get_predictions(initial_layer, model_x_train, Y_train, model_x_test)
  File "<ipython-input-111-1334bb85e127>", line 169, in train_and_get_predictions
    trainingModel = createModel_v2(initial_layer,
  File "<ipython-input-111-1334bb85e127>", line 130, in createModel_v2
    x = Dropout(0.2)(x)
  File "/Users/aditya/miniconda3/envs/kaggle/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/Users/aditya/miniconda3/envs/kaggle/lib/python3.9/site-packages/tensorflow/python/framework/constant_op.py", line 102, in convert_to_eager_tensor
    return ops.EagerTensor(value, ctx.device_name, dtype)
ValueError: Exception encountered when calling layer "dropout_114" (type Dropout).

Attempt to convert a value (<cond_rnn.cond_rnn.ConditionalRecurrent object at 0x157ce4520>) with an unsupported type (<class 'cond_rnn.cond_rnn.ConditionalRecurrent'>) to a Tensor.

Call arguments received:
  • inputs=<cond_rnn.cond_rnn.ConditionalRecurrent object at 0x157ce4520>
  • training=False

Could you help me understand whether adding Dropout layers is supported and I'm doing it wrong, or whether it's known that it's not supported? Thanks! Happy to provide more context if needed.

philipperemy commented 1 year ago

Yes, Dropout is supported. I've pasted below a snippet based on your code that works for me.

import numpy as np
from tensorflow.keras.layers import Dense, LSTM, Dropout, LeakyReLU, Input
from tensorflow.keras.models import Model

from cond_rnn import ConditionalRecurrent

BS = None        # batch size can stay unspecified when stateful=False
TIME_STEPS = 10  # sequence length
INPUT_DIM = 2    # features per time step
NUM_CLASSES = 1  # dimensionality of the conditioning input
STATEFUL = False
NUM_CELLS = 20

i = Input(shape=[TIME_STEPS, INPUT_DIM], name='input_0')
c = Input(shape=[NUM_CLASSES], name='input_1')

x = ConditionalRecurrent(
    LSTM(
        64,
        batch_input_shape=(BS, TIME_STEPS, INPUT_DIM),
        activation='tanh',  # 'relu',
        return_sequences=True, stateful=STATEFUL
    ))([i, c])
x = Dropout(0.2)(x)
x = ConditionalRecurrent(LSTM(128, activation='relu', return_sequences=True, stateful=STATEFUL))([x, c])
x = Dropout(0.2)(x)
x = ConditionalRecurrent(LSTM(256, activation='relu', return_sequences=True, stateful=STATEFUL))([x, c])
x = Dropout(0.2)(x)
x = Dense(units=78)(x)
x = LeakyReLU()(x)

model2 = Model(inputs=[i, c], outputs=[x])
print(model2.predict([np.zeros(shape=(30, TIME_STEPS, INPUT_DIM)), np.zeros(shape=(30, NUM_CLASSES))]).shape)

Output

2023-03-14 09:11:21.681654: W tensorflow/tsl/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
1/1 [==============================] - 0s 323ms/step
(30, 10, 78)
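
Looking at your traceback, the value that reached Dropout was the ConditionalRecurrent layer object itself rather than an output tensor. That usually happens when the wrapper is constructed but never called on its inputs. A hypothetical minimal reproduction of that failure mode (the missing call is my assumption, not something visible in your pasted code):

# Hypothetical bug: the wrapper is built but never called on [i, c],
# so `x` is the layer object itself, not an output tensor.
x = ConditionalRecurrent(LSTM(64, return_sequences=True))  # note: no ([i, c]) call
x = Dropout(0.2)(x)  # ValueError: unsupported type ConditionalRecurrent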

If you want to use stateful=True, you have to specify the batch size in the Input layers. It has to be fixed because the recurrent state is persisted between batches: when you feed your sequences one step at a time, Keras keeps the states in memory (a sketch of resetting those states follows the example below).

The stateful=True version is here:

import numpy as np
from tensorflow.keras.layers import Dense, LSTM, Dropout, LeakyReLU, Input
from tensorflow.keras.models import Model

from cond_rnn import ConditionalRecurrent

BS = 30          # fixed batch size, required when stateful=True
TIME_STEPS = 10  # sequence length
INPUT_DIM = 2    # features per time step
NUM_CLASSES = 1  # dimensionality of the conditioning input
STATEFUL = True
NUM_CELLS = 20

i = Input(batch_input_shape=[BS, TIME_STEPS, INPUT_DIM], name='input_0')
c = Input(batch_input_shape=[BS, NUM_CLASSES], name='input_1')

x = ConditionalRecurrent(
    LSTM(
        64,
        batch_input_shape=(BS, TIME_STEPS, INPUT_DIM),
        activation='tanh',  # 'relu',
        return_sequences=True, stateful=STATEFUL
    ))([i, c])
x = Dropout(0.2)(x)
x = ConditionalRecurrent(LSTM(128, activation='relu', return_sequences=True, stateful=STATEFUL))([x, c])
x = Dropout(0.2)(x)
x = ConditionalRecurrent(LSTM(256, activation='relu', return_sequences=True, stateful=STATEFUL))([x, c])
x = Dropout(0.2)(x)
x = Dense(units=78)(x)
x = LeakyReLU()(x)

model2 = Model(inputs=[i, c], outputs=[x])
print(model2.predict([np.zeros(shape=(BS, TIME_STEPS, INPUT_DIM)), np.zeros(shape=(BS, NUM_CLASSES))]).shape)
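
Since the states persist across batches, it is common to clear them once an epoch (or an independent sequence) ends. A minimal sketch continuing the snippet above, assuming zero-valued placeholder data and an MSE loss picked purely for illustration:

model2.compile(optimizer='adam', loss='mse')
x_seq = np.zeros(shape=(BS, TIME_STEPS, INPUT_DIM))
x_cond = np.zeros(shape=(BS, NUM_CLASSES))
y = np.zeros(shape=(BS, TIME_STEPS, 78))  # matches the Dense(78) output shape
for epoch in range(3):
    # shuffle=False keeps batch order so the persisted state stays meaningful
    model2.fit([x_seq, x_cond], y, batch_size=BS, shuffle=False, epochs=1, verbose=0)
    model2.reset_states()  # drop the stored LSTM states between epochs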