cazala / synaptic

architecture-free neural network library for node.js and the browser
http://caza.la/synaptic

Static Output for Identity Function #243

Closed NarendraPatwardhan closed 6 years ago

NarendraPatwardhan commented 7 years ago

First of all, thank you for all the amazing work put into the library. I am currently exploring the effect of hyperparameters and activation functions on a simple XOR network. While training it and calculating the RMS error with different activation functions, all functions give differing results within a certain range, except for the identity function, which gives a constant error. The code I have used is as follows (with `pow = Math.pow`, `Layer = synaptic.Layer`, `Neuron = synaptic.Neuron`, `Network = synaptic.Network`, etc.):

```javascript
var inputLayer = new Layer(2);
var hiddenLayer = new Layer(3);
var outputLayer = new Layer(1);

inputLayer.set({
  squash: Neuron.squash.TANH // changing activation functions here
});
hiddenLayer.set({
  squash: Neuron.squash.TANH // changing activation functions here
});
outputLayer.set({
  squash: Neuron.squash.TANH // changing activation functions here
});

inputLayer.project(hiddenLayer);
hiddenLayer.project(outputLayer);

var myNetwork = new Network({
  input: inputLayer,
  hidden: [hiddenLayer],
  output: outputLayer
});

var learningRate = .3;
for (var i = 0; i < 20000; i++) {
  // 0,0 => 0
  myNetwork.activate([0,0]);
  myNetwork.propagate(learningRate, [0]);

  // 0,1 => 1
  myNetwork.activate([0,1]);
  myNetwork.propagate(learningRate, [1]);

  // 1,0 => 1
  myNetwork.activate([1,0]);
  myNetwork.propagate(learningRate, [1]);

  // 1,1 => 0
  myNetwork.activate([1,1]);
  myNetwork.propagate(learningRate, [0]);
}

var err1 = 0 - myNetwork.activate([0,0]);
var err2 = 1 - myNetwork.activate([0,1]);
var err3 = 1 - myNetwork.activate([1,0]);
var err4 = 0 - myNetwork.activate([1,1]);
var err = pow(pow(err1,2) + pow(err2,2) + pow(err3,2) + pow(err4,2), 0.5);
console.log(err);
```

While TANH and RELU give errors on the order of ~10^-4 and ~10^-3, with values that differ from run to run, IDENTITY always gives 1.4142135623730950488 (√2).

wagenaartje commented 7 years ago

The IDENTITY function will not converge on the XOR dataset because it is a linear activation function, and XOR is a non-linear problem (it is not linearly separable). A network whose layers all use identity activations collapses into a single linear map, no matter how many layers it has. The identity function's gradient is also always 1, so the weights/biases converge to the same point every time, giving a static final error.
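A quick way to see the "collapses into a single linear map" part (a minimal sketch in plain Node.js, not using synaptic; the weight values are arbitrary, just for illustration): compose a 2→3 and a 3→1 identity-activation layer, then build the single 2→1 linear map W₂·W₁ with bias W₂·b₁ + b₂, and check that both compute the same function.

```javascript
// Sketch: two stacked identity ("linear") layers are equivalent to one
// linear layer, so no depth of identity layers can ever fit XOR.

// One linear layer: y = W·x + b (W is an array of rows).
function linearLayer(W, b) {
  return function (x) {
    return W.map(function (row, i) {
      return row.reduce(function (s, w, j) { return s + w * x[j]; }, b[i]);
    });
  };
}

// Hypothetical weights for a 2-3-1 identity network (arbitrary values).
var hidden = linearLayer([[1, -2], [0.5, 1], [-1, 3]], [0.1, -0.2, 0.3]);
var output = linearLayer([[2, -1, 0.5]], [0.05]);
var network = function (x) { return output(hidden(x)); };

// Collapse the two layers by hand: W' = W2·W1, b' = W2·b1 + b2.
var collapsed = linearLayer(
  [[2*1 + -1*0.5 + 0.5*-1, 2*-2 + -1*1 + 0.5*3]],   // = [[1, -3.5]]
  [2*0.1 + -1*-0.2 + 0.5*0.3 + 0.05]                // = [0.6]
);

// Both compute the same value on every input.
[[0, 0], [0, 1], [1, 0], [1, 1]].forEach(function (x) {
  console.log(network(x)[0], collapsed(x)[0]);
});
```

Since the collapsed map is a single linear function of the inputs, no setting of the weights can produce 0, 1, 1, 0 on the four XOR points.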

The identity activation function should not be used on its own for most problems, but combined with other activation functions.
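For instance (a minimal sketch in plain Node.js, not synaptic's API; the initial weights and learning rate are illustrative assumptions): a 2-3-1 network with TANH hidden units and an IDENTITY output unit can still learn XOR, because the hidden layer supplies the non-linearity.

```javascript
// Sketch: TANH hidden layer + IDENTITY output layer trained with plain
// backpropagation on XOR. Initial weights are arbitrary asymmetric values.
var data = [[[0,0],[0]], [[0,1],[1]], [[1,0],[1]], [[1,1],[0]]];

var w1 = [[0.5,-0.4],[-0.3,0.6],[0.2,0.1]], b1 = [0.1,-0.2,0.05];
var w2 = [0.4,-0.5,0.3], b2 = 0;
var lr = 0.3;

function forward(x) {
  var h = w1.map(function (row, j) {
    return Math.tanh(row[0]*x[0] + row[1]*x[1] + b1[j]); // TANH hidden
  });
  var y = h[0]*w2[0] + h[1]*w2[1] + h[2]*w2[2] + b2;     // IDENTITY output
  return { h: h, y: y };
}

for (var epoch = 0; epoch < 20000; epoch++) {
  data.forEach(function (sample) {
    var x = sample[0], t = sample[1][0];
    var f = forward(x);
    var dOut = f.y - t;                                // d(sq. error)/dy
    for (var j = 0; j < 3; j++) {
      var dH = dOut * w2[j] * (1 - f.h[j]*f.h[j]);     // tanh derivative
      w2[j] -= lr * dOut * f.h[j];
      w1[j][0] -= lr * dH * x[0];
      w1[j][1] -= lr * dH * x[1];
      b1[j] -= lr * dH;
    }
    b2 -= lr * dOut;
  });
}

// Same RMS error measure as in the original post; now well below sqrt(2).
var errs = data.map(function (s) { return s[1][0] - forward(s[0]).y; });
var err = Math.sqrt(errs.reduce(function (a, e) { return a + e*e; }, 0));
console.log(err);
```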

Why must a nonlinear activation function be used in a backpropagation neural network?