lmnt-com / haste

Haste: a fast, simple, and open RNN library

Biases in final IndRNN layer are 0 #34

Closed DaStapo closed 3 years ago

DaStapo commented 3 years ago

I assume this is not supposed to happen, but I checked the model's parameters after training and these were the values from the final IndRNN layer in my model:

```
rnn2.bias Parameter containing:
tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       requires_grad=True)
```

This is my module:

```python
self.rnn = haste.IndRNN(input_dim, hidden_dim, batch_first=True, zoneout=0.1, return_state_sequence=True)
self.rnn2 = haste.IndRNN(hidden_dim, 64, batch_first=True, zoneout=0.075)
self.d1 = nn.Dropout(0.15)
```

And the forward function:

```python
out, (hn) = self.rnn(x)
out, (hn) = self.rnn2(self.d1(out))
```

DaStapo commented 3 years ago

Never mind, I didn't include the final layer's parameters in the optimizer, so `rnn2` was never updated and its biases stayed at their zero initialization.
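For anyone hitting the same symptom, here is a minimal sketch of the fix. The module and layer names mirror the code above; the import alias `haste_pytorch` and the Adam settings are assumptions for illustration, not part of the original report:

```python
import torch
import torch.nn as nn
import haste_pytorch as haste  # import name is an assumption; adjust to your install


class Model(nn.Module):
    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.rnn = haste.IndRNN(input_dim, hidden_dim, batch_first=True,
                                zoneout=0.1, return_state_sequence=True)
        self.rnn2 = haste.IndRNN(hidden_dim, 64, batch_first=True, zoneout=0.075)
        self.d1 = nn.Dropout(0.15)

    def forward(self, x):
        out, hn = self.rnn(x)
        out, hn = self.rnn2(self.d1(out))
        return out


model = Model(input_dim=32, hidden_dim=128)  # dimensions are placeholders

# Buggy pattern: only the first layer's parameters are registered with the
# optimizer, so rnn2 never receives gradient updates and its
# zero-initialized biases stay at zero.
#   optimizer = torch.optim.Adam(model.rnn.parameters(), lr=1e-3)

# Fix: hand the optimizer every parameter of the module.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```

A quick sanity check after a few training steps is to print `model.rnn2.bias` again; once the layer is in the optimizer, the values should drift away from zero.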