Hello,
I have used
self.intra_rnn = keras.layers.Bidirectional(keras.layers.CuDNNLSTM(units=self.numUnits//2, return_sequences=True))
and
self.inter_rnn = keras.layers.CuDNNLSTM(units=self.numUnits, return_sequences=True)
during the training process, which significantly increases the training speed.
However, is it possible to use a regular LSTM for inference so that it can run on the CPU?
If so, what should the LSTM parameters be? Would using the layer below give different results compared to CuDNNLSTM?
keras.layers.LSTM(units=self.numUnits//2, return_sequences=True, implementation=2, recurrent_activation='hard_sigmoid')
In TensorFlow 2.x, the requirements to use the cuDNN implementation are:
activation == tanh
recurrent_activation == sigmoid
recurrent_dropout == 0
unroll is False
use_bias is True
Inputs are not masked, or are strictly right-padded.
implementation == 2
I haven't compared the LSTM code in TF1 with that in TF2; they may differ slightly. If you use the above parameters, be careful about the weight alignment when loading the CuDNNLSTM weights into the LSTM layers.
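For reference, a minimal sketch of what such a CPU-runnable replacement could look like inside the model class (assuming TF 2.x, keras imported as in the snippet above, and the same self.numUnits), with recurrent_activation='sigmoid' rather than 'hard_sigmoid' so the gate math matches CuDNNLSTM:

self.intra_rnn = keras.layers.Bidirectional(
    keras.layers.LSTM(
        units=self.numUnits // 2,
        return_sequences=True,
        activation='tanh',               # cuDNN-compatible
        recurrent_activation='sigmoid',  # cuDNN-compatible; 'hard_sigmoid' gives different outputs
        recurrent_dropout=0.0,
        unroll=False,
        use_bias=True,
    )
)
self.inter_rnn = keras.layers.LSTM(
    units=self.numUnits,
    return_sequences=True,
    activation='tanh',
    recurrent_activation='sigmoid',
    recurrent_dropout=0.0,
    unroll=False,
    use_bias=True,
)

With these arguments, TF 2.x should select the fused cuDNN kernel when a GPU is available and fall back to the standard implementation on the CPU, so the same layer definitions can serve both training and CPU inference.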