Closed: DennisCraandijk closed this issue 5 years ago
Hello Dennis,
I hardcoded the activation function as tanh just to experiment with it, but my final aim is to make it a parameter in the MLP's constructor. I will make this change as soon as I have time.
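For reference, a minimal sketch of what that could look like, assuming a PyTorch-style module; the class name, layer sizes, and default activation below are placeholders rather than the actual code in this repo:

```python
import torch.nn as nn

class MLP(nn.Module):
    """Sketch only: an MLP whose activation is passed in via the constructor."""
    def __init__(self, sizes, activation=nn.Tanh):
        super().__init__()
        layers = []
        for in_dim, out_dim in zip(sizes[:-1], sizes[1:]):
            layers.append(nn.Linear(in_dim, out_dim))
            layers.append(activation())  # e.g. nn.Tanh or nn.ReLU
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

# Hypothetical usage:
# mlp = MLP([128, 128, 128], activation=nn.ReLU)
```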
Ok great, just checking if there was any other reason.
Hi Andrea,
Thanks for this neat implementation!
I've noticed the MLP consists of linear layers, each followed by a tanh layer (code). However, in the RRN paper the authors mention using ReLU layers followed by a linear layer. Is this variation intentional?
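To make the difference concrete, here is how I read the two structures, sketched with placeholder layer widths (the actual sizes are whatever the repo and paper use):

```python
import torch.nn as nn

# Structure in this repo, as I read the linked code:
# every linear layer is followed by tanh.
mlp_here = nn.Sequential(
    nn.Linear(128, 128), nn.Tanh(),
    nn.Linear(128, 128), nn.Tanh(),
)

# Structure described in the RRN paper:
# ReLU after the hidden layer(s), then a final linear layer
# with no activation on the output.
mlp_paper = nn.Sequential(
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 128),
)
```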