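It just occurred to me that the sigmoid function (which I had used as the activation function for all the hidden layers) probably deals poorly with exclusively positive data from the input layer. It is centered at 0, with sigmoid(0) = 0.5 (so it is sigmoid(x) - 0.5 that is antisymmetric around 0, not the sigmoid itself), and it maps values in (-inf, +inf) to (0, 1). Positive values passed through it therefore only ever land in the upper half of that range, (0.5, 1).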
I think we'll have to dig a bit to discover the best activation functions for each layer.
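To make the concern concrete, here is a minimal sketch of what the sigmoid does to positive-only values. It ignores weights and biases entirely (a real layer applies those first, and negative weights can produce negative pre-activations), the input values are made up for illustration, and mean-centering is just one possible remedy I'm assuming here, not something we have settled on.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid, mapping any real x into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical strictly positive input values, chosen only for illustration.
positive_inputs = [0.1, 0.5, 1.0, 2.0, 5.0]

# Applied directly to positive values, the sigmoid only uses the upper half of its range.
raw_outputs = [sigmoid(x) for x in positive_inputs]

# Centering the inputs on their mean (an assumed remedy) spreads outputs across both halves.
mean = sum(positive_inputs) / len(positive_inputs)
centered_outputs = [sigmoid(x - mean) for x in positive_inputs]

print("raw:     ", [round(y, 3) for y in raw_outputs])
print("centered:", [round(y, 3) for y in centered_outputs])
```

Every raw output sits above 0.5, while the centered ones fall on both sides of it. Whether that actually hurts training in our network depends on the learned weights and biases, so this only shows the effect in isolation.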