Closed NarendraPatwardhan closed 6 years ago
The IDENTITY function will not converge on the XOR dataset because it is a linear activation function, and XOR is a non-linear problem. Its gradient is also always 1, so the weights/biases converge to the same point, giving a static final error.
The identity activation function should not be used on its own on most problems, but combined with other activation functions.
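A quick way to see why no identity-only network can fit XOR: stacking identity-activated layers only composes affine maps, and the composition of affine maps is still affine. Every affine map f(x1, x2) = w1·x1 + w2·x2 + b satisfies f(0,0) + f(1,1) = f(0,1) + f(1,0) (both sides equal w1 + w2 + 2b), while the XOR targets give 0 + 0 on the left and 1 + 1 on the right. A small sketch checking this identity numerically:

```javascript
// Any affine map f(x1, x2) = w1*x1 + w2*x2 + b satisfies
//   f(0,0) + f(1,1) === f(0,1) + f(1,0)   (both equal w1 + w2 + 2*b),
// while XOR would require 0 + 0 on the left and 1 + 1 on the right.
function affine(w1, w2, b) {
  return (x1, x2) => w1 * x1 + w2 * x2 + b;
}

// Verify the identity for a few arbitrary weight settings:
for (const [w1, w2, b] of [[0.3, -1.2, 0.5], [2, 2, -1], [-0.7, 0.1, 0]]) {
  const f = affine(w1, w2, b);
  const lhs = f(0, 0) + f(1, 1);
  const rhs = f(0, 1) + f(1, 0);
  console.log(Math.abs(lhs - rhs) < 1e-12); // true for every affine f
}
```

Since no affine map can satisfy the XOR targets, gradient descent has nothing to converge to, which is exactly the static error observed below.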
Why must a nonlinear activation function be used in a backpropagation neural network?
First of all, thank you for all the amazing work put into the library. I am currently exploring the effect of hyperparameters and activation functions on a simple XOR network. While training it and calculating the RMS error with different activation functions, all functions give differing results within a certain range, except for the identity function, which gives a constant error. The code I have used is as follows (with aliases such as `pow = Math.pow`, `Neuron = synaptic.Neuron`, etc.):

```javascript
var inputLayer = new Layer(2);
var hiddenLayer = new Layer(3);
var outputLayer = new Layer(1);

inputLayer.set({
  squash: Neuron.squash.TANH // changing activation functions here
});
hiddenLayer.set({
  squash: Neuron.squash.TANH // changing activation functions here
});
outputLayer.set({
  squash: Neuron.squash.TANH // changing activation functions here
});

inputLayer.project(hiddenLayer);
hiddenLayer.project(outputLayer);

var myNetwork = new Network({
  input: inputLayer,
  hidden: [hiddenLayer],
  output: outputLayer
});

var learningRate = 0.3;
for (var i = 0; i < 20000; i++) {
  // 0,0 => 0
  myNetwork.activate([0, 0]);
  myNetwork.propagate(learningRate, [0]);
  // 0,1 => 1
  myNetwork.activate([0, 1]);
  myNetwork.propagate(learningRate, [1]);
  // 1,0 => 1
  myNetwork.activate([1, 0]);
  myNetwork.propagate(learningRate, [1]);
  // 1,1 => 0
  myNetwork.activate([1, 1]);
  myNetwork.propagate(learningRate, [0]);
}

var err1 = 0 - myNetwork.activate([0, 0]);
var err2 = 1 - myNetwork.activate([0, 1]);
var err3 = 1 - myNetwork.activate([1, 0]);
var err4 = 0 - myNetwork.activate([1, 1]);
var err = pow(pow(err1, 2) + pow(err2, 2) + pow(err3, 2) + pow(err4, 2), 0.5);
console.log(err);
```
While TANH and RELU give differing errors on the order of ~10^-4 and 10^-3 across runs, IDENTITY always gives 1.4142135623730950488 (sqrt 2).