rasmusbergpalm / DeepLearnToolbox

Matlab/Octave toolbox for deep learning. Includes Deep Belief Nets, Stacked Autoencoders, Convolutional Neural Nets, Convolutional Autoencoders and vanilla Neural Nets. Each method has examples to get you started.
BSD 2-Clause "Simplified" License

random vs. zero initialization of weights #31

Open gallamine opened 11 years ago

gallamine commented 11 years ago

Perhaps I'm mistaken, but it seems you initialize the DBN weights to all zeros before training, rather than doing a random weight initialization. Is there a reason for this? From my reading, random weight initialization seems to be the better choice.

rasmusbergpalm commented 11 years ago

Can you point me to a reference? If you're right and can prove it, then I'll happily accept a PR to change it.

rasmusbergpalm commented 11 years ago

?

gallamine commented 11 years ago

My comment was in reference to Hinton's paper, "A Practical Guide to Training Restricted Boltzmann Machines", though I seem to recall a Bengio paper where he didn't do it, so I'm not sure whether it's necessary or not.

Section 8 says:

The weights are typically initialized to small random values chosen from a zero-mean Gaussian with a standard deviation of about 0.01. Using larger random values can speed the initial learning, but it may lead to a slightly worse final model. Care should be taken to ensure that the initial weight values do not allow typical visible vectors to drive the hidden unit probabilities very close to 1 or 0 as this significantly slows the learning.
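
To make the quoted advice concrete, here is a minimal Python sketch (not the toolbox's Matlab code). It draws weights from a zero-mean Gaussian and counts how many hidden units an all-ones visible vector pushes close to 0 or 1. The function names, the layer sizes (784 visible, 100 hidden), the zero hidden bias, and the 0.01/0.99 saturation thresholds are all assumptions chosen for illustration; none of them come from the paper or from DeepLearnToolbox.

```python
import math
import random


def init_weights(n_visible, n_hidden, std=0.01, seed=0):
    """Draw an n_hidden x n_visible weight matrix from a zero-mean Gaussian."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, std) for _ in range(n_visible)]
            for _ in range(n_hidden)]


def hidden_probabilities(weights, visible, hidden_bias=0.0):
    """Return sigmoid(bias + sum_i W[j][i] * v[i]) for each hidden unit j."""
    probs = []
    for row in weights:
        activation = hidden_bias + sum(w * v for w, v in zip(row, visible))
        probs.append(1.0 / (1.0 + math.exp(-activation)))
    return probs


def count_saturated(std, n_visible=784, n_hidden=100, low=0.01, high=0.99):
    """Count hidden units whose probability falls outside [low, high]."""
    visible = [1.0] * n_visible  # assumed extreme case: every visible unit on
    weights = init_weights(n_visible, n_hidden, std=std)
    probs = hidden_probabilities(weights, visible)
    return sum(1 for p in probs if p < low or p > high)


if __name__ == "__main__":
    for std in (0.01, 1.0):
        print(f"std={std}: {count_saturated(std)} hidden units saturated")
```

With a standard deviation of 0.01 the summed input to each hidden unit stays small, so the probabilities should stay near 0.5. With a standard deviation of 1.0 most units should land near 0 or 1, which is the regime the paper warns against. All-zero weights give exactly 0.5 for every hidden unit, so the sketch does not by itself show whether zero initialization hurts training.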

rasmusbergpalm commented 11 years ago

Alright. Send a PR and I'll accept it!