lmjohns3 / theanets

Neural network toolkit for Python
http://theanets.rtfd.org
MIT License

Adding sparsity constraint to autoencoder cost #5

Closed: kastnerkyle closed this 11 years ago

kastnerkyle commented 11 years ago

Is there currently any way to add a sparsity constraint to the cost of an autoencoder? I see regularization terms (weight_l1, l2, etc.), but another tutorial also mentions an explicit sparsity term. Looking at it, it seems (to me at least) different from regularization based on the weights. However, other deep learning notes don't seem to have this parameter, at least not in the form shown in the link.

If we do need this functionality, I am thinking a separate SparseAutoencoder class might be better than adding construction options to the current autoencoder - what are your thoughts?

lmjohns3 commented 11 years ago

I think what you're referring to is a sparsity constraint on the activations of the hidden units:

loss = || V g(W x) - x ||_2 + a || g(W x) ||_1

where W are the encoding weights, V are the decoding weights, and the second term implements a penalty on the hidden-unit activations.

If that's what you're referring to, use the --hidden-l1 command-line flag to control the value of a.
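(For concreteness, here is a minimal NumPy sketch of that loss, not the theanets implementation; the sigmoid activation, the layer shapes, and the default value of a are illustrative assumptions.)

```python
import numpy as np

def sigmoid(z):
    # Hypothetical choice for the activation g(.)
    return 1.0 / (1.0 + np.exp(-z))

def sparse_autoencoder_loss(x, W, V, a=0.1):
    """Reconstruction error plus an L1 penalty on hidden activations.

    x : (n_inputs,) input vector
    W : (n_hidden, n_inputs) encoding weights
    V : (n_inputs, n_hidden) decoding weights
    a : weight on the hidden-L1 sparsity term (what --hidden-l1 controls)
    """
    h = sigmoid(W @ x)                            # hidden activations g(W x)
    reconstruction = V @ h                        # V g(W x)
    error = np.linalg.norm(reconstruction - x)    # || V g(W x) - x ||_2
    sparsity = np.sum(np.abs(h))                  # || g(W x) ||_1
    return error + a * sparsity
```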

It might make more sense to newcomers to have a SparseAutoencoder subclass, but that still wouldn't solve the problem of how to set the a parameter. Any thoughts?

kastnerkyle commented 11 years ago

This is exactly what I was looking for - just missed it because I had my blinders on! I kept looking at the Autoencoder cost directly, rather than the J shared by all nets. Oops.

As far as the a parameter is concerned (and parameters in general), I have been looking at two papers from ICML 2013: No More Pesky Learning Rates and On the Importance of Initialization and Momentum in Deep Learning. Maybe implementing these will give some ideas?

lmjohns3 commented 11 years ago

I actually just implemented NAG from the second paper. :) I've been waiting to check it in until we get this cascaded trainer thing merged.
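(For reference, a minimal sketch of the Nesterov-style momentum update described in that paper, not the theanets code; the grad callable and the lr/mu values are placeholders.)

```python
import numpy as np

def nag_update(params, velocity, grad, lr=0.01, mu=0.9):
    """One Nesterov accelerated gradient step (Sutskever et al., ICML 2013).

    params   : current parameter vector
    velocity : running momentum vector, same shape as params
    grad     : callable returning the gradient at a given parameter vector
    lr, mu   : learning rate and momentum coefficient (placeholder values)
    """
    # Evaluate the gradient at the "look-ahead" point params + mu * velocity,
    # then take a momentum step using that gradient.
    lookahead_grad = grad(params + mu * velocity)
    velocity = mu * velocity - lr * lookahead_grad
    params = params + velocity
    return params, velocity
```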

I think I read the learning rates paper, but I don't remember the details of how it works. I'll give it another look this week.

Going to close this one out.

kastnerkyle commented 11 years ago

Glad to hear you implemented NAG! It looks really promising.