lmjohns3 / theanets

Neural network toolkit for Python
http://theanets.rtfd.org
MIT License

Overcomplete basis with autoencoder with tied_weights=True leads to total sparsity in decode? #6

Closed by kastnerkyle 10 years ago

kastnerkyle commented 11 years ago

This gist sums up what I am seeing. When I try to train an overcomplete autoencoder, the sparsity for all decode layers shows up as 1.0, and the cost gets "stuck" at ~87 (presumably because the gradient can't flow backwards through totally sparse layers?).
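Roughly, the kind of setup I mean (a minimal sketch rather than the exact gist; the constructor, keyword, and method names are from memory of the theanets API, so treat them as assumptions):

```python
import numpy as np
import theanets

# Overcomplete autoencoder: hidden layer wider than the input,
# with the decoder weights tied to the encoder weights.
exp = theanets.Experiment(
    theanets.feedforward.Autoencoder,
    layers=(784, 1000, 784),  # 1000 > 784, so the basis is overcomplete
    tied_weights=True,
)

# Stand-in data here; the real run used flattened MNIST digits scaled to [0, 1].
train = np.random.rand(10000, 784).astype('float32')

# After training, every decode layer reports sparsity ~1.0 and the
# reconstruction cost plateaus around ~87.
exp.train(train)
```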

I encountered this while trying to build the canonical 784-1000-500-250-30-250-500-1000-784 deep autoencoder for MNIST digits, but I didn't have time to explore or recreate it until now. Any thoughts?

lmjohns3 commented 10 years ago

Wow, this is an old one -- sorry for letting it drop!

I don't really have anything insightful to say about this, except that the deep autoencoder example you contributed seems to work fairly well at the moment, even with logistic activations and SGD training. My go-to solutions for deep networks that don't train well all at once are to reach for "relu" activations (instead of sigmoids), and then to try a higher-order trainer, like the CG or HF trainers.
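Something along these lines is what I'd try first (just a sketch; the exact keyword names for the activation and the trainer vary between theanets versions, so treat them as assumptions):

```python
import numpy as np
import theanets

# The same deep autoencoder, but with relu hidden units instead of sigmoids.
exp = theanets.Experiment(
    theanets.feedforward.Autoencoder,
    layers=(784, 1000, 500, 250, 30, 250, 500, 1000, 784),
    activation='relu',  # assumed kwarg name; some versions call it hidden_activation
)

# Stand-in for flattened MNIST digits in [0, 1].
train = np.random.rand(10000, 784).astype('float32')

# If plain SGD stalls, switch to a higher-order trainer.
exp.train(train, optimize='cg')  # or optimize='hf' for Hessian-free
```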

kastnerkyle commented 10 years ago

Thanks - I am pretty sure this is less a "bug in the code" and more a "mathematical limitation of neural networks". Closing!