lmjohns3 / theanets

Neural network toolkit for Python
http://theanets.rtfd.org
MIT License

Pre-training and fine-tuning deep autoencoders #12

Closed hakaseren closed 10 years ago

hakaseren commented 10 years ago

Dear Theano-nets users,

Thanks to your support, I have been making slow but steady progress in my experiments with neural networks.

Now I'm trying to work with deep autoencoders: first pre-training stacked auto-encoders, then fine-tuning the resulting network with a linear regression layer on top.

Theano-nets ships a deep autoencoder example, but I am not sure whether the pre-training and fine-tuning steps are performed in that code or whether they have to be written explicitly.

Also, is there any option to train deep autoencoders and then use the resulting network with the provided recurrent neural network code?

Again, thank you very much for your help.

H.R.

kastnerkyle commented 10 years ago

Is there a paper showing that pretraining deep autoencoders actually works? I have only seen autoencoders used for pretraining a deep classifier. I have never tried it, but I do know that training a deep autoencoder without any prior knowledge is very time consuming. If the deep autoencoder example is the one I worked on, then no, it does not do any pretraining - it just learns directly from the data with very careful selection of hyperparameters. If there is some theory behind pretraining a deep autoencoder with one-layer autoencoders to get easier convergence, I am very, very interested!

Also, what were you planning to do with the trained autoencoder and the recurrent network? I am pretty sure it is doable if the theory is there (may need to add some code), but I have never seen any applications of that and am very curious!

hakaseren commented 10 years ago

Hello, thank you for your answer. My idea of pre-training auto-encoders follows results published in the paper "Greedy layer-wise training of deep networks" (2007) by Yoshua Bengio et al. But you are right, it has not been clearly demonstrated to work for tasks other than classification. With autoencoders trained one layer at a time, I wanted to make a comparison against an RBM-RNN.

lmjohns3 commented 10 years ago

The "layerwise" trainer implements the strategy described by Bengio et al (2007) -- you might look at the source code in trainer.py to see what it does. I personally haven't found it to be that effective, but probably I haven't spent enough time tuning the learning parameters. Instead, though, I've found that alternative unit activations like relu allow for an entire deep network to be trained in one go.

I'm not sure whether there would be a way to pretrain RNNs with autoencoders, since the recurrent structure of an RNN incorporates a completely new weight matrix that's not present in an autoencoder. But, you could try comparing the performance of the two types of networks on a particular task?

hakaseren commented 10 years ago

Thank you very much! The layerwise trainer is indeed exactly as in the paper. I have tried the following code:

import theanets

e = theanets.Experiment(theanets.feedforward.Regressor,
                        layers=(600, 500, 400, 300, 200, 50),
                        learning_rate=0.001,
                        optimize='layerwise+hf',
                        activation='tanh',
                        hidden_l2=0.0001,
                        batch_size=64,
                        train_batches=64, valid_batches=64)
e.run(train_set, valid_set)

However, it has triggered the following error message:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 540, in runfile
    execfile(filename, namespace)
  File "FFNN.py", line 76, in <module>
    e.run(train_set, valid_set)
  File "C:\Python27\lib\site-packages\theanets\main.py", line 198, in run
    cg_set=self.datasets['cg'])
  File "C:\Python27\lib\site-packages\theanets\trainer.py", line 204, in train
    trainer.train(train_set, valid_set=valid_set, **kwargs)
  File "C:\Python27\lib\site-packages\theanets\trainer.py", line 256, in train
    bs = len(first(train_set.minibatches[0]))
AttributeError: 'SequenceDataset' object has no attribute 'minibatches'

How may I specify minibatches for this code to run?

Finally, is there a way to save and load the parameters of a network in order to resume training or to import external weights?

Thank you again for all your help and for building this very good package.

lmjohns3 commented 10 years ago

Hm, I think the traceback you've posted is from an older version of the layerwise trainer code. Try updating your theanets package (I just uploaded a new version for pip), or you could also use the github code directly if you're feeling adventurous.

As for the saving and loading -- you can save a network to a pickle file, and load a saved pickle file into a network. However, there's not currently a way to load the weights from some other source (e.g. a npy file), unless you write something to do it by hand. This actually wouldn't be too difficult; you'd just need to do something like

network.weights[0].set_value(some_numpy_array)

which would set the weights connecting the input units to the first layer of hidden units.
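As a minimal by-hand sketch (the .npy file name is a placeholder, and the loaded array's shape and dtype must match the existing weight matrix):

import numpy as np

# Load externally produced weights from a placeholder .npy file.
some_numpy_array = np.load('external_weights.npy')

# Overwrite the weights connecting the inputs to the first hidden layer.
network.weights[0].set_value(some_numpy_array)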

hakaseren commented 10 years ago

Thank you very much for the advice and for the update. It now works very well.

However, it might just be my own working environment, but I now run into problems when putting more than one trainer in the 'optimize' parameter. I understand that the names should be separated by whitespace rather than "+", but when I do that I receive a KeyError. Should I revise the way I call the trainers?

lmjohns3 commented 10 years ago

Yes, if you provide a programmatic value for the optimize parameter, make it a list or tuple of strings:

Experiment(..., optimize=['layerwise', 'hf'])

Really we should make the code split out the names for you if you provide one string, but hopefully providing a list isn't too onerous. :)

hakaseren commented 10 years ago

Thanks a million for your assistance!

lmjohns3 commented 10 years ago

Sure thing!

By the way, I just checked in a change to Experiment that will let you provide one string for the "optimize" parameter -- it will be included in the next pip release.
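With that change, a single whitespace-separated string like the one you tried should also work -- for example (assuming the string is split into trainer names for you):

Experiment(..., optimize='layerwise hf')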