Open gallamine opened 11 years ago
Can you point me to a reference? If you're right and can prove it, then I'll happily accept a PR to change it.
My comment was in reference to Hinton's paper, "A Practical Guide to Training Restricted Boltzmann Machines", though I seem to recall a Bengio paper where he didn't do it, so I'm not sure whether it's necessary or not.
Section 8 says:
The weights are typically initialized to small random values chosen from a zero-mean Gaussian with a standard deviation of about 0.01. Using larger random values can speed the initial learning, but it may lead to a slightly worse final model. Care should be taken to ensure that the initial weight values do not allow typical visible vectors to drive the hidden unit probabilities very close to 1 or 0 as this significantly slows the learning.
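For illustration, the initialization described in that passage might look roughly like the sketch below. It uses only the Python standard library and is not this repo's code: the function names `init_rbm_weights` and `hidden_probabilities`, the layer sizes, and the zero hidden biases are all assumptions made up for the example.

```python
import math
import random


def init_rbm_weights(n_visible, n_hidden, std=0.01, seed=None):
    """Hypothetical helper: draw each weight from a zero-mean Gaussian.

    std=0.01 follows the value quoted from Section 8; everything else
    (name, signature, list-of-lists layout) is assumed for this sketch.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, std) for _ in range(n_hidden)]
            for _ in range(n_visible)]


def hidden_probabilities(visible, weights, hidden_bias):
    """Sigmoid activation of each hidden unit for one visible vector."""
    probs = []
    for j in range(len(hidden_bias)):
        activation = hidden_bias[j] + sum(
            v * weights[i][j] for i, v in enumerate(visible)
        )
        probs.append(1.0 / (1.0 + math.exp(-activation)))
    return probs


# Toy sizes, chosen only for the example.
weights = init_rbm_weights(n_visible=6, n_hidden=4, seed=0)
visible = [1, 0, 1, 1, 0, 1]
# Zero hidden biases are an assumption of this sketch.
probs = hidden_probabilities(visible, weights, hidden_bias=[0.0] * 4)
```

With weights this small, the hidden probabilities start out close to 0.5 rather than near 0 or 1, which is the condition the quoted passage asks for.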
Alright. Send a PR and I'll accept it!
Perhaps I'm mistaken, but it seems you initialize the DBN weights to all zeros before training, rather than doing a random weight initialization. Is there a reason for this? It seems from my reading that random weight initialization would be best.