rasmusbergpalm / DeepLearnToolbox

Matlab/Octave toolbox for deep learning. Includes Deep Belief Nets, Stacked Autoencoders, Convolutional Neural Nets, Convolutional Autoencoders and vanilla Neural Nets. Each method has examples to get you started.

SAE weights #106

Open · mnaetr opened this issue 10 years ago

mnaetr commented 10 years ago

Hi. Does anyone know how to get weights like these? I'm trying to follow this tutorial

http://ufldl.stanford.edu/wiki/index.php/Visualizing_a_Trained_Autoencoder

and to get the same results with this code, but I cannot. Does anyone have any advice or ideas? Thanks in advance.

[attached image: example sparse autoencoder weights]

rasmusbergpalm commented 10 years ago

I've gotten weights like that. Use sparsity.
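For example, starting from the toolbox's own `test_example_SAE.m` and turning on the sparsity terms. A minimal sketch; the penalty, target, and epoch values below are illustrative, not tuned:

```matlab
% Train a 784-100 sparse autoencoder on MNIST (based on test_example_SAE.m).
load mnist_uint8;
train_x = double(train_x) / 255;

sae = saesetup([784 100]);
sae.ae{1}.activation_function = 'sigm';
sae.ae{1}.learningRate        = 1;
sae.ae{1}.sparsityTarget      = 0.05;   % desired mean hidden activation
sae.ae{1}.nonSparsityPenalty  = 3;      % weight of the sparsity term (illustrative)
sae.ae{1}.weightPenaltyL2     = 1e-4;   % L2 weight decay (illustrative)

opts.numepochs = 10;
opts.batchsize = 100;
sae = saetrain(sae, train_x, opts);

% Plot the learned filters; each row of W{1} is one unit (bias in column 1).
visualize(sae.ae{1}.W{1}(:, 2:end)')
```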

mnaetr commented 10 years ago

Ok. How can I correctly choose the values of the different hyperparameters (learningRate, sparsityTarget, nonSparsityPenalty and weightPenalty)?

rasmusbergpalm commented 10 years ago

Cross-validation.
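Concretely, one way to set that up (a sketch: the validation split, the grids, and the use of held-out reconstruction error as the selection criterion are assumptions on my part, not something the toolbox prescribes):

```matlab
% Hypothetical grid search: pick SAE hyperparameters by held-out reconstruction error.
load mnist_uint8;
x = double(train_x) / 255;
val_x = x(1:10000, :);                     % held-out validation split
tr_x  = x(10001:end, :);

opts.numepochs = 5;  opts.batchsize = 100;
best_err = inf;
for lr = [0.1 1]
  for target = [0.01 0.05]
    for penalty = [1 3]
      sae = saesetup([784 100]);
      sae.ae{1}.learningRate       = lr;
      sae.ae{1}.sparsityTarget     = target;
      sae.ae{1}.nonSparsityPenalty = penalty;
      sae = saetrain(sae, tr_x, opts);
      nn = nnff(sae.ae{1}, val_x, val_x);  % reconstruction loss on held-out data
      if nn.L < best_err, best_err = nn.L; best_sae = sae; end
    end
  end
end
```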


Tgaaly commented 10 years ago

I'm having the same problem with the DBN example. The second-layer weights learnt by the DBN are pretty much garbage; the first layer is fine. Unlike the autoencoders, there don't seem to be many options to set on the DBN. Any information about how to obtain good second-layer weights would be greatly appreciated.

yu239-zz commented 10 years ago

First, make sure you aren't visualizing the second layer with just its weight matrix; that is not correct, because it ignores the nonlinear activation function between the layers. To visualize a higher layer you have to solve an optimization problem for each hidden unit: find the input that maximizes that unit's activation, given the learned parameters.
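A minimal sketch of that optimization for one layer-2 unit of a two-layer sigmoid net; the sizes, random stand-in weights, step size, and iteration count are all illustrative, not toolbox API:

```matlab
% Activation maximization: find the unit-norm input that maximizes unit j in layer 2.
n = 784; h1 = 100; h2 = 100; j = 1;           % illustrative sizes and unit index
W1 = 0.01*randn(h1, n);  b1 = zeros(h1, 1);   % stand-ins for the learned parameters
W2 = 0.01*randn(h2, h1); b2 = zeros(h2, 1);
sigm = @(z) 1 ./ (1 + exp(-z));

x = randn(n, 1); x = x / norm(x);             % random unit-norm starting image
for it = 1:200
    a1 = sigm(W1*x + b1);                     % layer-1 activations
    a2 = sigm(W2*a1 + b2);                    % layer-2 activations
    % gradient of a2(j) w.r.t. x via the chain rule, using sigm'(z) = a.*(1-a)
    d1 = (W2(j,:)' * (a2(j)*(1 - a2(j)))) .* (a1 .* (1 - a1));
    g  = W1' * d1;
    x  = x + 0.1*g;                           % gradient ascent step
    x  = x / norm(x);                         % project back onto the unit sphere
end
% reshape(x, 28, 28) shows the optimal stimulus for an MNIST-sized input
```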

yu239-zz commented 10 years ago

Visualizing higher layers is more difficult and has no analytic solution like the first layer does (the first-layer solution can be derived in a few minutes). For a reference, see the 'Google cat' paper (Le et al., 2012, "Building High-level Features Using Large Scale Unsupervised Learning") and look at how they visualize the higher layers in their experiments.
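For contrast, the first-layer case from the UFLDL tutorial linked above: for a sigmoid unit a = sigm(w'*x + b) with the input constrained to unit norm, the maximizing input is just the normalized weight vector, which is why plotting the weights directly works for layer 1. In toolbox terms (a sketch, reusing `sae` from test_example_SAE.m):

```matlab
% Optimal first-layer stimuli: each unit's normalized weight vector (UFLDL result).
W = sae.ae{1}.W{1}(:, 2:end);                     % hidden x input, bias column dropped
X_opt = bsxfun(@rdivide, W, sqrt(sum(W.^2, 2)));  % normalize each unit's weights
visualize(X_opt')                                 % matches visualize(W') up to per-image scaling
```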