philipperemy / cond_rnn

Conditional RNNs for Tensorflow / Keras.
MIT License

Adding dropout layer to stacked conditional RNNs #42

Closed · adityadivekar03 closed this issue 1 year ago

adityadivekar03 commented 1 year ago

I'm trying to build the following architecture, inspired by the stacked LSTM example in the repository. The only difference is that I also include Dropout layers between the stacked LSTM layers.

x = ConditionalRecurrent(LSTM(64,
                              batch_input_shape=(batchSize, num_samples, num_features), 
                              activation='tanh', #'relu', 
                              return_sequences=True, stateful=stateful))([i, c])
x = Dropout(0.2)(x)    
x = ConditionalRecurrent(LSTM(128, activation='relu', return_sequences=True, stateful=stateful))([x, c])
x = Dropout(0.2)(x)    
x = ConditionalRecurrent(LSTM(256, activation='relu', return_sequences=True, stateful=stateful))([x, c])
x = Dropout(0.2)(x)
x = Dense(units=78)(x)
x = LeakyReLU()(x)

This, however, gives me the error below at the Dropout layer:

Traceback (most recent call last):
  File "<ipython-input-111-1334bb85e127>", line 295, in run_LSTM
    history, model, Y_pred = train_and_get_predictions(initial_layer, model_x_train, Y_train, model_x_test)
  File "<ipython-input-111-1334bb85e127>", line 169, in train_and_get_predictions
    trainingModel = createModel_v2(initial_layer,
  File "<ipython-input-111-1334bb85e127>", line 130, in createModel_v2
    x = Dropout(0.2)(x)
  File "/Users/aditya/miniconda3/envs/kaggle/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/Users/aditya/miniconda3/envs/kaggle/lib/python3.9/site-packages/tensorflow/python/framework/constant_op.py", line 102, in convert_to_eager_tensor
    return ops.EagerTensor(value, ctx.device_name, dtype)
ValueError: Exception encountered when calling layer "dropout_114" (type Dropout).

Attempt to convert a value (<cond_rnn.cond_rnn.ConditionalRecurrent object at 0x157ce4520>) with an unsupported type (<class 'cond_rnn.cond_rnn.ConditionalRecurrent'>) to a Tensor.

Call arguments received:
  • inputs=<cond_rnn.cond_rnn.ConditionalRecurrent object at 0x157ce4520>
  • training=False

Could you help me understand whether adding Dropout layers is supported and I'm doing it wrong, or whether it's known that it's not supported? Thanks! Happy to provide more context if needed.

philipperemy commented 1 year ago

Yes, Dropout is supported. I've pasted below a snippet based on your code that works for me.

import numpy as np
from tensorflow.keras.layers import Dense, LSTM, Dropout, LeakyReLU, Input
from tensorflow.keras.models import Model

from cond_rnn import ConditionalRecurrent

BS = None        # batch size can stay unspecified when stateful=False
TIME_STEPS = 10  # sequence length
INPUT_DIM = 2    # features per time step
NUM_CLASSES = 1  # dimensionality of the conditioning input
STATEFUL = False
NUM_CELLS = 20

i = Input(shape=[TIME_STEPS, INPUT_DIM], name='input_0')
c = Input(shape=[NUM_CLASSES], name='input_1')

x = ConditionalRecurrent(
    LSTM(
        64,
        batch_input_shape=(BS, TIME_STEPS, INPUT_DIM),
        activation='tanh',  # 'relu',
        return_sequences=True, stateful=STATEFUL
    ))([i, c])
x = Dropout(0.2)(x)
x = ConditionalRecurrent(LSTM(128, activation='relu', return_sequences=True, stateful=STATEFUL))([x, c])
x = Dropout(0.2)(x)
x = ConditionalRecurrent(LSTM(256, activation='relu', return_sequences=True, stateful=STATEFUL))([x, c])
x = Dropout(0.2)(x)
x = Dense(units=78)(x)
x = LeakyReLU()(x)

model2 = Model(inputs=[i, c], outputs=[x])
print(model2.predict([np.zeros(shape=(30, TIME_STEPS, INPUT_DIM)), np.zeros(shape=(30, NUM_CLASSES))]).shape)

Output

2023-03-14 09:11:21.681654: W tensorflow/tsl/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
1/1 [==============================] - 0s 323ms/step
(30, 10, 78)
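
Looking at your traceback, the value that reached Dropout was the ConditionalRecurrent layer object itself rather than an output tensor. That usually happens when the wrapper is constructed but never called on its inputs. A hypothetical minimal reproduction of that failure mode (the missing call is my assumption, not something visible in your pasted code):

# Hypothetical bug: the wrapper is built but never called on [i, c],
# so `x` is the layer object itself, not an output tensor.
x = ConditionalRecurrent(LSTM(64, return_sequences=True))  # note: no ([i, c]) call
x = Dropout(0.2)(x)  # ValueError: unsupported type ConditionalRecurrent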

If you want to use stateful=True, you have to specify the batch size in the Input layers. It has to be fixed because the recurrent state is persisted between batches: when you feed your sequences one step at a time, Keras keeps the states in memory (a sketch of resetting those states follows the example below).

The stateful=True version is here:

import numpy as np
from tensorflow.keras.layers import Dense, LSTM, Dropout, LeakyReLU, Input
from tensorflow.keras.models import Model

from cond_rnn import ConditionalRecurrent

BS = 30          # fixed batch size, required when stateful=True
TIME_STEPS = 10  # sequence length
INPUT_DIM = 2    # features per time step
NUM_CLASSES = 1  # dimensionality of the conditioning input
STATEFUL = True
NUM_CELLS = 20

i = Input(batch_input_shape=[BS, TIME_STEPS, INPUT_DIM], name='input_0')
c = Input(batch_input_shape=[BS, NUM_CLASSES], name='input_1')

x = ConditionalRecurrent(
    LSTM(
        64,
        batch_input_shape=(BS, TIME_STEPS, INPUT_DIM),
        activation='tanh',  # 'relu',
        return_sequences=True, stateful=STATEFUL
    ))([i, c])
x = Dropout(0.2)(x)
x = ConditionalRecurrent(LSTM(128, activation='relu', return_sequences=True, stateful=STATEFUL))([x, c])
x = Dropout(0.2)(x)
x = ConditionalRecurrent(LSTM(256, activation='relu', return_sequences=True, stateful=STATEFUL))([x, c])
x = Dropout(0.2)(x)
x = Dense(units=78)(x)
x = LeakyReLU()(x)

model2 = Model(inputs=[i, c], outputs=[x])
print(model2.predict([np.zeros(shape=(BS, TIME_STEPS, INPUT_DIM)), np.zeros(shape=(BS, NUM_CLASSES))]).shape)
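
Since the states persist across batches, it is common to clear them once an epoch (or an independent sequence) ends. A minimal sketch continuing the snippet above, assuming zero-valued placeholder data and an MSE loss picked purely for illustration:

model2.compile(optimizer='adam', loss='mse')
x_seq = np.zeros(shape=(BS, TIME_STEPS, INPUT_DIM))
x_cond = np.zeros(shape=(BS, NUM_CLASSES))
y = np.zeros(shape=(BS, TIME_STEPS, 78))  # matches the Dense(78) output shape
for epoch in range(3):
    # shuffle=False keeps batch order so the persisted state stays meaningful
    model2.fit([x_seq, x_cond], y, batch_size=BS, shuffle=False, epochs=1, verbose=0)
    model2.reset_states()  # drop the stored LSTM states between epochs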