Closed praneet195 closed 3 years ago
Sorry for the late reply. You can replace any of the stacked LSTMCells or GRUCells in your code with FastGRNN or FastRNN. Unless you want to induce sparsity, you don't need the FastCell Trainer. It is also easy to induce sparsity with stacked RNN cells in the same way FastCellTrainer does it now. If you have further questions, feel free to ask them here with examples so that I can give you a precise answer.
Hi, I'm unable to add these cells in the way I would add, say, LSTMCells. For example, in the FastGRNN cell example where the cell has been defined, is there any way to pass its output to another cell and hence stack them?
Is this what you want - https://github.com/Microsoft/EdgeML/blob/master/tf/edgeml/trainer/fastTrainer.py#L73 ?
Also, if you have specific code in use, you can share the snippet in a gist and I can look at it and suggest how to change it. I can stack FastCells as easily as LSTM cells.
Yes, thank you, this is exactly what I needed. Is there a code snippet you can provide that stacks multiple cells? I'm running into a dimensionality issue while trying to send the output of one cell to another.
Hi Praneet,
I would rather encourage you to share your case so that I can help you. In my view, using https://www.tensorflow.org/api_docs/python/tf/nn/rnn_cell/MultiRNNCell is very straightforward for FastGRNN if your code works for GRU.
Alright, this is my case: I've created this model in Keras using RNN and the LSTMCell. The model is as follows:
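On the dimensionality question above: the usual rule when stacking is that each cell's input size must equal the hidden size of the cell below it, and only the first cell sees the raw feature size. The sketch below illustrates this in plain Python rather than TensorFlow. `ToyFastGRNNCell`, `run_stack`, and the layer sizes are hypothetical names and values chosen for illustration, not EdgeML code, and zeta and nu are fixed constants here although FastGRNN learns them as scalars.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


class ToyFastGRNNCell:
    """Illustrative FastGRNN-style cell; not the EdgeML implementation."""

    def __init__(self, input_dim, hidden_dim, rng, zeta=1.0, nu=0.0):
        def rand(rows, cols):
            return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.input_dim = input_dim
        self.hidden_dim = hidden_dim
        self.W = rand(hidden_dim, input_dim)   # input-to-hidden, shared by gate and update
        self.U = rand(hidden_dim, hidden_dim)  # hidden-to-hidden, shared by gate and update
        self.b_z = [1.0] * hidden_dim
        self.b_h = [0.0] * hidden_dim
        self.zeta, self.nu = zeta, nu          # assumption: fixed, not trained

    def step(self, x, h):
        if len(x) != self.input_dim:
            raise ValueError("expected input of size %d, got %d" % (self.input_dim, len(x)))
        pre = [a + b for a, b in zip(matvec(self.W, x), matvec(self.U, h))]
        z = [1.0 / (1.0 + math.exp(-(p + b))) for p, b in zip(pre, self.b_z)]
        h_tilde = [math.tanh(p + b) for p, b in zip(pre, self.b_h)]
        # h_t = (zeta * (1 - z) + nu) * h_tilde + z * h_prev
        return [(self.zeta * (1.0 - zi) + self.nu) * ci + zi * hi
                for zi, ci, hi in zip(z, h_tilde, h)]


def run_stack(cells, sequence):
    """Feed each time step through the cells in order; return top-layer outputs."""
    states = [[0.0] * cell.hidden_dim for cell in cells]
    outputs = []
    for x in sequence:
        layer_input = x
        for i, cell in enumerate(cells):
            states[i] = cell.step(layer_input, states[i])
            layer_input = states[i]  # output of layer i is the input of layer i + 1
        outputs.append(layer_input)
    return outputs


rng = random.Random(0)
# Input sizes chain: 8 -> 64 -> 64 -> 32.
cells = [ToyFastGRNNCell(8, 64, rng), ToyFastGRNNCell(64, 64, rng), ToyFastGRNNCell(64, 32, rng)]
sequence = [[rng.random() for _ in range(8)] for _ in range(5)]
outputs = run_stack(cells, sequence)
print(len(outputs), len(outputs[0]))  # 5 32
```

If the second cell were built with an input size other than 64, `step` would raise a `ValueError`, which is the toy analogue of the dimensionality error described above.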
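# imports assumed; the traceback below shows tf.keras in use
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import RNN, BatchNormalization, Dense, Dropout
cell1=tf.nn.rnn_cell.LSTMCell(64)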
cell2=tf.nn.rnn_cell.LSTMCell(64)
cell3=tf.nn.rnn_cell.LSTMCell(64)
cell4=tf.nn.rnn_cell.LSTMCell(64)
model = Sequential()
model.add(RNN(cell1, input_shape=(train_X.shape[1:]),return_sequences=True))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(RNN(cell2, return_sequences=True))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(RNN(cell3, return_sequences=True))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(RNN(cell4, return_sequences=False))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(128, activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
opt = tf.keras.optimizers.Adam(lr=1e-3, decay=1e-5)
model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
history = model.fit(train_X, train_y, epochs=10, batch_size=1024, validation_data=(test_X, test_y), verbose=1, shuffle=True)
However, if I substitute the LSTMCell in the above case with, say, the FastRNNCell or FastGRNNCell, I get the following error:
Traceback (most recent call last):
File "fasttrainer.py", line 72, in <module>
model.add(RNN(cell1, input_shape=(train_X.shape[1:]),return_sequences=True))
File "d:\Users\prane\Anaconda3\lib\site-packages\tensorflow\python\training\checkpointable\base.py", line 474, in _method_wrapper
method(self, *args, **kwargs)
File "d:\Users\prane\Anaconda3\lib\site-packages\tensorflow\python\keras\engine\sequential.py", line 159, in add
layer(x)
File "d:\Users\prane\Anaconda3\lib\site-packages\tensorflow\python\keras\layers\recurrent.py", line 619, in __call__
return super(RNN, self).__call__(inputs, **kwargs)
File "d:\Users\prane\Anaconda3\lib\site-packages\tensorflow\python\keras\engine\base_layer.py", line 757, in __call__
outputs = self.call(inputs, *args, **kwargs)
File "d:\Users\prane\Anaconda3\lib\site-packages\tensorflow\python\keras\layers\recurrent.py", line 750, in call
input_length=timesteps)
File "d:\Users\prane\Anaconda3\lib\site-packages\tensorflow\python\keras\backend.py", line 3292, in rnn
swap_memory=True)
File "d:\Users\prane\Anaconda3\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 3291, in while_loop
return_same_structure)
File "d:\Users\prane\Anaconda3\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 3004, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "d:\Users\prane\Anaconda3\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 2939, in _BuildLoop
body_result = body(*packed_vars_for_body)
File "d:\Users\prane\Anaconda3\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 3260, in <lambda>
body = lambda i, lv: (i + 1, orig_body(*lv))
File "d:\Users\prane\Anaconda3\lib\site-packages\tensorflow\python\keras\backend.py", line 3277, in _step
tuple(states) + tuple(constants))
File "d:\Users\prane\Anaconda3\lib\site-packages\tensorflow\python\keras\layers\recurrent.py", line 737, in step
output, new_states = self.cell.call(inputs, states, **kwargs)
File "C:\Users\prane\Desktop\keras-lstm-master\edgeml\graph\rnn.py", line 291, in call
initializer=W_matrix_init)
File "d:\Users\prane\Anaconda3\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1487, in get_variable
aggregation=aggregation)
File "d:\Users\prane\Anaconda3\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1237, in get_variable
aggregation=aggregation)
File "d:\Users\prane\Anaconda3\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 540, in get_variable
aggregation=aggregation)
File "d:\Users\prane\Anaconda3\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 492, in _true_getter
aggregation=aggregation)
File "d:\Users\prane\Anaconda3\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 861, in _get_single_variable
name, "".join(traceback.format_list(tb))))
ValueError: Variable FastRNN/FastRNNcell/W already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:
File "d:\Users\prane\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1770, in __init__
self._traceback = tf_stack.extract_stack()
File "d:\Users\prane\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 3274, in create_op
op_def=op_def)
File "d:\Users\prane\Anaconda3\lib\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
return func(*args, **kwargs)
It says that the variable FastRNN/FastRNNcell/W already exists and is disallowed, as seen above. Any reason why?
Hi Praneet,
It is evident from the logs that the scope is the issue. Try using this in the declaration and let me know if it fails. Also, do you expect all the cells to have different weights or do you expect them to be coupled?
cell1=FastRNNCell(64, name="FastGRNNCell1")
cell2=FastRNNCell(64, name="FastGRNNCell2")
cell3=FastRNNCell(64, name="FastGRNNCell3")
cell4=FastRNNCell(64, name="FastGRNNCell4")
Hi Aditya, Firstly, thank you for the prompt replies. I'm getting the following error after trying your fix.
ValueError: Variable FastGRNNCell1/FastRNNcell/W already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope?
Not sure why the scope issue is arising. Also, I expect them to have different weights.
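One plausible reading of the two errors, sketched as a toy model rather than TensorFlow code: the traceback shows the cell creating W inside call() (rnn.py, line 291) and Keras invoking self.cell.call directly. If Keras runs that step function more than once while building the graph (an assumption, not confirmed in this thread), then unique names fix collisions between cells but not the second call on the same cell. `ToyVariableStore`, `CellCreatingInCall`, and `CellCreatingInBuild` are hypothetical names for illustration only.

```python
class ToyVariableStore:
    """Toy stand-in for a graph-wide variable store; not TensorFlow's API."""

    def __init__(self):
        self._vars = {}

    def get_variable(self, name, reuse=False):
        if name in self._vars:
            if not reuse:
                raise ValueError("Variable %s already exists, disallowed." % name)
            return self._vars[name]
        self._vars[name] = [0.0]
        return self._vars[name]


class CellCreatingInCall:
    """Requests its weight on every call(), as the traceback suggests."""

    def __init__(self, store, name="FastRNN"):
        self.store, self.name = store, name

    def call(self):
        return self.store.get_variable(self.name + "/FastRNNcell/W")


class CellCreatingInBuild:
    """Creates its weight once and reuses the stored reference afterwards."""

    def __init__(self, store, name):
        self.store, self.name = store, name
        self.W = None

    def call(self):
        if self.W is None:
            self.W = self.store.get_variable(self.name + "/FastRNNcell/W")
        return self.W


def try_call(cell):
    try:
        cell.call()
        return "ok"
    except ValueError:
        return "error"


# 1. Two cells with the same default name collide with each other.
store = ToyVariableStore()
default_names = [try_call(CellCreatingInCall(store)), try_call(CellCreatingInCall(store))]

# 2. Unique names separate the cells, but a second call on cell 1 still collides.
store = ToyVariableStore()
c1 = CellCreatingInCall(store, "FastGRNNCell1")
c2 = CellCreatingInCall(store, "FastGRNNCell2")
unique_names = [try_call(c1), try_call(c2), try_call(c1)]

# 3. Creating the weight once makes repeated calls safe.
store = ToyVariableStore()
d = CellCreatingInBuild(store, "FastGRNNCell1")
build_once = [try_call(d), try_call(d)]

print(default_names)  # ['ok', 'error']
print(unique_names)   # ['ok', 'ok', 'error']
print(build_once)     # ['ok', 'ok']
```

Case 2 mirrors the second error, where the variable name already carries the new prefix FastGRNNCell1 yet is still reported as existing.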
@praneet195, I have tested your LSTM code, and in TensorFlow it should fail with the same error you shared unless you are using a different version of tf. It seems to run fine in Keras for some reason.
cell1=FastGRNNCell(64, name="FastGRNNCell1")
cell2=FastGRNNCell(64, name="FastGRNNCell2")
cell3=FastGRNNCell(64, name="FastGRNNCell3")
cell4=FastGRNNCell(64, name="FastGRNNCell4")
should fix the issue in TensorFlow (it is the same as the earlier suggestion), unless there is something that hasn't been shared with me yet. I need to look back at the Keras code I wrote when I implemented FastGRNN for Keras and see what the difference is.
I think it is better if you mail your entire code and any test data to me. I don't understand why it is failing even after scope resolution. I have written multiple test scripts in tf to test this case, and the error was fully resolved once the names were clearly disambiguated.
Update: Keras seems to be doing something weird in my test scripts. It isn't failing where yours is. It is failing because of an internal change of shape, which isn't actually happening but Keras thinks it is. I think we had better take this offline with actual working code and data and then debug.
Sure, I will drop a mail on your official id and let's take this offline.
I have a similar problem to the one described above. To make it work in Keras, I made the following changes.
In the original class FastGRNNCell(RNNCell), I changed the declaration and first line of call() to the following:
def call(self, inputs, states):
    state = states[0]
I also wrapped this class in a Keras layer:
class FastGRNNCellWrapper(Layer):
    def __init__(self, units, **kwargs):
        self.units = units
        self.state_size = units
        super(FastGRNNCellWrapper, self).__init__(**kwargs)
        self.lstm = FastGRNNCell(units, **kwargs)
    def call(self, inputs, states, training=None):
        return self.lstm(inputs, states)
Then it started to work, but the results are really bad. It would be great if somebody could share an implementation in Keras.
@praneet195 If I remember correctly, we were able to debug this offline. Can you please share the code in that case?
Maybe my model will also be useful. To reproduce the issue, just use FastGRNNCell instead of FastGRNNCellWrapper.
timesteps = 50
data_dim = 128
num_classes = 1
model = Sequential()
# 1D ConvNet
model.add(Reshape((timesteps, data_dim, 1), input_shape=(timesteps, data_dim,)))
model.add(TimeDistributed(Conv1D(filters=data_dim, kernel_size=32, padding='same', name='conv1')))
model.add(TimeDistributed(BatchNormalization()))
model.add(TimeDistributed(Activation('relu')))
model.add(TimeDistributed(MaxPooling1D(pool_size=data_dim - 32 + 1)))
model.add(TimeDistributed(Dropout(0.3)))
# RNN-LSTM
model.add(Reshape((timesteps, data_dim,)))
model.add(RNN(FastGRNNCellWrapper(data_dim, name='lstm1'), name='lstm1', return_sequences=True))
model.add(Dropout(rate=0.3, name='drop1'))
model.add(RNN(FastGRNNCellWrapper(data_dim, name='lstm2'), name='lstm2', return_sequences=True))
model.add(Dropout(rate=0.3, name='drop2'))
# Fully connected layer
model.add(TimeDistributed(Dense(data_dim, activation='linear', kernel_initializer='VarianceScaling', name='fc3')))
model.add(TimeDistributed(BatchNormalization()))
model.add(TimeDistributed(Activation('relu', name='relu3')))
model.add(TimeDistributed(Dropout(rate=0.3, name='drop3')))
# Output layer
model.add(TimeDistributed(Dense(num_classes, activation='sigmoid', name='output')))
I have tested your code, but the weights in FastGRNN have not been added to model.trainable_weights. It seems the parameters are not trainable when FastGRNNCellWrapper is used directly in Keras. Do you have this problem? @DmitryKhlus
The results may be that bad because the FastGRNN weights have not been added to the Keras model as trainable weights. In my experiment, I can't find FastGRNN's weights in model.trainable_weights.
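The hypothesis above can be pictured with a toy model. This is not Keras code and makes no claim about how any particular Keras version tracks weights. It only assumes a framework that trains just the weights a layer registers through its own registration method. `ToyLayer`, `UntrackedCell`, `WrapperLayer`, and `RegisteringCell` are hypothetical names for illustration.

```python
class ToyLayer:
    """Toy layer base class: only weights added via add_weight are tracked."""

    def __init__(self):
        self.trainable_weights = []

    def add_weight(self, name, size):
        weight = {"name": name, "value": [0.0] * size}
        self.trainable_weights.append(weight)
        return weight


# Toy stand-in for variables created outside any layer's bookkeeping.
UNTRACKED_VARIABLES = {}


class UntrackedCell:
    """Creates its weight in a global store, so no layer ever sees it."""

    def __init__(self, units):
        self.units = units

    def call(self, inputs):
        UNTRACKED_VARIABLES.setdefault("W", [0.0] * self.units)
        return inputs


class WrapperLayer(ToyLayer):
    """Wraps the untracked cell without registering any of its weights."""

    def __init__(self, units):
        super().__init__()
        self.cell = UntrackedCell(units)

    def call(self, inputs):
        return self.cell.call(inputs)


class RegisteringCell(ToyLayer):
    """Registers its weights through add_weight the first time it is used."""

    def __init__(self, units):
        super().__init__()
        self.units = units
        self.built = False

    def build(self):
        self.W = self.add_weight("W", self.units)
        self.U = self.add_weight("U", self.units)
        self.built = True

    def call(self, inputs):
        if not self.built:
            self.build()
        return inputs


wrapped = WrapperLayer(4)
wrapped.call([1.0, 2.0, 3.0, 4.0])
registered = RegisteringCell(4)
registered.call([1.0, 2.0, 3.0, 4.0])
print(len(wrapped.trainable_weights), len(registered.trainable_weights))  # 0 2
```

Under that assumption, an optimizer that only walks trainable_weights would never update the wrapped cell, which would be consistent with a model that runs but learns poorly.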
@dileep3004, I have very little insight into the latest Keras versions and I do not have the setup to reproduce this. A simple thing to do is to check whether the same error persists with other cells such as GRULRCell. The error seems weird and my Keras implementations are very old. Maybe @praneet195 or @shipleyxie can look at this, as they seem to have figured out the Keras stuff :).
@DmitryKhlus, sorry, I lost track of this issue because it shows as closed. As mentioned above, I have very little insight into Keras nowadays.
You need to change the cells as well. I don't know what RNN(cell) or LSTM(cell) does in keras. At least in the current code, it seems like you are creating an LSTMLRCell and passing it to RNN().
Pointing me to what RNN() or LSTM() does might help. But when I asked you to change the cells, I meant changing cell1, cell2, cell3.
Also, please remember we haven't written this code with TF 2.x in mind, so I don't know of any compatibility issues with that. But from what I can see, the code you just posted will not work for multiple reasons that are not even related to the internals of EdgeML.
OK, I will debug it.
Can you point me to some resources on including FastCells in Keras models?
This thread contains some information. FastCells are exactly like GRUCell or LSTMCell of native Tensorflow. The incorporation of FastCells follows almost the same path as the other two.
Sorry that I couldn't be of more help. You can always use TF 1.x, start from the fastcell_example, and then use tf.nn.rnn_cell.MultiRNNCell (not sure about the exact syntax) to get multiple layers of FastCells.
I just modified the FastGRNN/FastRNN cells from the original code into a Keras implementation: https://github.com/yunishi3/FastGRNN-for-Keras
@yunishi3 thanks a ton. I will go through the implementation just to be safe.
I think your implementation doesn't support training to induce sparsity. Can you please add that to the README as well? The functional implementation of the cells is exactly the tf code, so nothing to worry about there :)
@harsha-simhadri do we want to have keras cells in EdgeML? If so, this can be a starting point. I have implemented keras cells in the older static graph versions and I am not sure if that makes sense now.
@yunishi3 can you also check stacking of multiple of these cells? There was an error in scope the last time I checked it in Keras.
@adityakusupati Thank you for checking my repository. I just added a note about sparsity induction to the README.md.
As for stacking multiple of these cells, the following code worked in fastcell_example_keras.ipynb.
(Sorry, I forgot this thread was about stacking cells...)
FastCell = FastGRNNCellKeras(hiddenDims)
FastCell_1 = FastGRNNCellKeras(hiddenDims)
FastCell_2 = FastGRNNCellKeras(hiddenDims)
~~
x = RNN(FastCell, return_sequences=True, name='rnn')(x)
x = RNN(FastCell_1, return_sequences=True, name='rnn1')(x)
x = RNN(FastCell_2, return_sequences=False, name='rnn2')(x)
Thanks @yunishi3 . This is very helpful.
Is there any way of providing more cells to the FastCellTrainer or does it only work with one cell?