This problem wasn't really an implementation error at all. The real issue was using the sigmoid function as the activation function: on a neural net with more than a single hidden layer, the net exhibited what I believe to be a severe case of vanishing gradients. I solved this by adding the option of using tanh as the activation function, which removed the vanishing-gradients problem when training with multiple hidden layers.
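For reference, here is a minimal sketch (illustrative, not code from NN.py) of why sigmoid is more prone to vanishing gradients than tanh. Backpropagation multiplies the error signal by the activation derivative once per layer, and sigmoid's derivative peaks at 0.25 while tanh's peaks at 1.0, so with sigmoid the gradient shrinks geometrically with depth even in the best case:

```python
import numpy as np

def sigmoid_prime(x):
    # Derivative of sigmoid; maximum value is 0.25, at x = 0.
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)

def tanh_prime(x):
    # Derivative of tanh; maximum value is 1.0, at x = 0.
    return 1.0 - np.tanh(x) ** 2

# Best-case gradient scale after n layers (one derivative factor per layer):
for n in (1, 2, 4, 8):
    print(f"{n} layers: sigmoid <= {0.25 ** n:.6f}, tanh <= {1.0 ** n:.1f}")
```

With 4 hidden layers the sigmoid factor alone is at most about 0.004, which matches the observed failure to train once a second hidden layer is added.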
Problem
The NN.py program does not train properly when using 2 or more hidden layers.

Possible Errors
The cause of this problem is currently unknown.

Methods To Find A Solution
More research will be done on the backpropagation function to see what may have been implemented incorrectly. The code will also be reviewed in case the problem turns out to be a minor algorithmic error; a numerical gradient check, sketched below, is one way to carry out that review.
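A standard way to verify a backpropagation implementation is to compare its analytic gradients against central-difference estimates. This is a sketch under the assumption that the network exposes its weights and a loss value; `loss_fn` and `w` are illustrative names, not identifiers from NN.py:

```python
import numpy as np

def numerical_grad(loss_fn, w, eps=1e-5):
    """Central-difference estimate of d(loss)/d(w) for each weight in w."""
    grad = np.zeros_like(w)
    it = np.nditer(w, flags=["multi_index"])
    while not it.finished:
        idx = it.multi_index
        orig = w[idx]
        w[idx] = orig + eps        # nudge one weight up
        plus = loss_fn()
        w[idx] = orig - eps        # nudge it down
        minus = loss_fn()
        w[idx] = orig              # restore the weight
        grad[idx] = (plus - minus) / (2 * eps)
        it.iternext()
    return grad
```

If the gradients produced by the backpropagation code differ noticeably from these estimates (relative error much above roughly 1e-5), the implementation has a bug; if they agree, the training failure points back to the activation function rather than the algorithm.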