Closed: DaStapo closed this issue 3 years ago.
I assume this is not supposed to happen, but I checked the model's parameters after training, and these were the values from the final IndRNN layer in my model:

```
rnn2.bias
Parameter containing:
tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       requires_grad=True)
```

This is my module:

```python
self.rnn = haste.IndRNN(input_dim, hidden_dim, batch_first=True, zoneout=0.1, return_state_sequence=True)
self.rnn2 = haste.IndRNN(hidden_dim, 64, batch_first=True, zoneout=0.075)
self.d1 = nn.Dropout(0.15)
```

And the forward function:

```python
out, (hn) = self.rnn(x)
out, (hn) = self.rnn2(self.d1(out))
```
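For context, here is a minimal sketch of how these fragments might fit together as a complete module. The class name `TwoLayerIndRNN`, the constructor signature, and the return values are assumptions for illustration, not taken from the original post:

```python
import torch.nn as nn
import haste_pytorch as haste  # assuming the haste PyTorch bindings are installed


class TwoLayerIndRNN(nn.Module):
    """Hypothetical module assembling the layers quoted in the post."""

    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        # First IndRNN returns the full state sequence; its output feeds the second layer.
        self.rnn = haste.IndRNN(input_dim, hidden_dim, batch_first=True,
                                zoneout=0.1, return_state_sequence=True)
        self.rnn2 = haste.IndRNN(hidden_dim, 64, batch_first=True, zoneout=0.075)
        self.d1 = nn.Dropout(0.15)

    def forward(self, x):
        out, hn = self.rnn(x)              # out: (batch, time, hidden_dim)
        out, hn = self.rnn2(self.d1(out))  # out: (batch, time, 64)
        return out, hn
```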
Never mind, I didn't include the final layer's parameters in the optimizer.
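For anyone hitting the same symptom: an all-zeros `rnn2.bias` after training usually just means that layer's parameters never reached the optimizer, so the bias stayed at its (presumably zero) initial value. Below is a minimal sketch of the fix plus a sanity check, assuming the hypothetical `TwoLayerIndRNN` module above and arbitrary example sizes:

```python
import torch

model = TwoLayerIndRNN(input_dim=32, hidden_dim=128)  # hypothetical sizes

# Passing model.parameters() picks up every registered submodule,
# so rnn2 cannot be forgotten the way a hand-built parameter list can.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Sanity check: confirm every named parameter is covered by the optimizer.
covered = {id(p) for group in optimizer.param_groups for p in group["params"]}
missing = [name for name, p in model.named_parameters() if id(p) not in covered]
assert not missing, f"parameters not handed to the optimizer: {missing}"
```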