keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

Example of Pretraining with Autoencoder? #402

Closed: valexandersaulys closed this issue 9 years ago

valexandersaulys commented 9 years ago

Specifically, is there an example of pretraining with an autoencoder for use with a convolutional neural network? If not, is there a more generic pretraining example?

Thanks!

fchollet commented 9 years ago

You can check out: https://github.com/fchollet/keras/pull/371
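
For readers finding this later: the linked PR predates the current API, but the general recipe it demonstrates (train an autoencoder on unlabeled data, then reuse the encoder's weights to initialize a supervised model) can be sketched roughly as follows in modern tf.keras. The layer sizes and the toy data are invented for illustration, not taken from the PR.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Toy data: 1000 samples of 100-dimensional vectors, 10 classes.
x_train = np.random.rand(1000, 100).astype("float32")
y_train = np.random.randint(0, 10, size=(1000,))

# 1. Unsupervised pretraining: encoder + decoder trained to reconstruct the input.
encoder = keras.Sequential([
    keras.Input(shape=(100,)),
    layers.Dense(32, activation="relu"),
], name="encoder")
decoder = keras.Sequential([
    keras.Input(shape=(32,)),
    layers.Dense(100, activation="sigmoid"),
], name="decoder")
autoencoder = keras.Sequential([encoder, decoder])
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(x_train, x_train, epochs=5, batch_size=32)

# 2. Supervised fine-tuning: keep the pretrained encoder, add a classifier head.
classifier = keras.Sequential([
    encoder,  # reuses the pretrained weights
    layers.Dense(10, activation="softmax"),
])
classifier.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
classifier.fit(x_train, y_train, epochs=5, batch_size=32)
```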

valexandersaulys commented 9 years ago

Is there a reason tie_weights is no longer implemented in the Autoencoder class? The docs on keras.io seem to show it still implemented, but the most recent version I just pulled from GitHub seems to have removed it.
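
For context, tie_weights made the decoder reuse the transpose of the encoder's weight matrix rather than learn its own. In versions without it, one way to approximate the behavior is a small custom layer. This is only a sketch in modern tf.keras; the TiedDense name and mechanics are illustrative, not anything that shipped in Keras.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

class TiedDense(layers.Layer):
    """Dense layer whose kernel is the transpose of another Dense layer's kernel."""
    def __init__(self, tied_to, activation=None, **kwargs):
        super().__init__(**kwargs)
        self.tied_to = tied_to  # the already-built encoder Dense layer
        self.activation = keras.activations.get(activation)

    def build(self, input_shape):
        # Only the bias is a new variable; the kernel is shared with `tied_to`.
        out_dim = self.tied_to.kernel.shape[0]
        self.bias = self.add_weight(name="bias", shape=(out_dim,), initializer="zeros")

    def call(self, inputs):
        return self.activation(tf.matmul(inputs, tf.transpose(self.tied_to.kernel)) + self.bias)

# The encoder layer must be built (so its kernel exists) before the decoder ties to it.
inputs = keras.Input(shape=(100,))
enc = layers.Dense(32, activation="relu")
hidden = enc(inputs)
outputs = TiedDense(enc, activation="sigmoid")(hidden)
autoencoder = keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")
```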

valexandersaulys commented 9 years ago

Also, I noticed that when trying to implement similar code I got a strange error. I think it stems from how the autoencoder works. When creating an autoencoder for the purposes of pretraining, there is a need to make the decoder mirror the encoder. In the example you linked to, this is what happens.

My question is, how does one mirror a convolutional neural network?

fchollet commented 9 years ago

> When creating an autoencoder for the purposes of pretraining, there is a need to make the decoder mirror the encoder.

Not really. Mirroring yields better computational efficiency and a slightly reduced risk of overfitting, but it is not conceptually superior to having separate encoders and decoders.

> My question is, how does one mirror a convolutional neural network?

Restoring an image input after convolution and maxpooling requires performing unpooling and deconvolution. There is some literature on the subject.
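
As a rough illustration of what mirroring looks like in practice, here is a sketch in modern tf.keras (the layer sizes are invented, and Conv2DTranspose would be an alternative to the UpSampling2D + Conv2D pair shown):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Encoder: 28x28 image -> 14x14 feature map ("same" padding keeps conv from shrinking it).
inputs = keras.Input(shape=(28, 28, 1))
x = layers.Conv2D(16, (3, 3), padding="same", activation="relu")(inputs)
x = layers.MaxPooling2D((2, 2))(x)                  # (28, 28, 16) -> (14, 14, 16)

# Mirrored decoder: UpSampling2D approximates unpooling, and a further convolution
# (or Conv2DTranspose) plays the role of deconvolution.
x = layers.UpSampling2D((2, 2))(x)                  # (14, 14, 16) -> (28, 28, 16)
outputs = layers.Conv2D(1, (3, 3), padding="same", activation="sigmoid")(x)

autoencoder = keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")
```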

Having separate encoders and decoders allows you to bypass this (fairly non-trivial) issue by simply using a Dense layer (or a stack thereof) to reconstruct your input. It should be good enough.
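
A minimal sketch of that suggestion, assuming modern tf.keras and made-up layer sizes (the thread predates the current API): a small convolutional encoder whose output is flattened and reconstructed by a single Dense layer, sidestepping unpooling entirely.

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(100, 1))

# Convolutional encoder.
x = layers.Conv1D(16, 3, padding="same", activation="relu")(inputs)
x = layers.Dropout(0.25)(x)
x = layers.MaxPooling1D(2)(x)                       # (100, 16) -> (50, 16)

# Dense decoder: no mirroring needed, just reconstruct the flat input.
x = layers.Flatten()(x)
x = layers.Dense(100, activation="sigmoid")(x)
outputs = layers.Reshape((100, 1))(x)

autoencoder = keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")
```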

valexandersaulys commented 9 years ago

For the life of me, I cannot figure out why this error would exist then:

Epoch 0
Traceback (most recent call last):
  File "conv1d_bengio_scc20x11.py", line 151, in <module>
    batch_size=BATCH_SIZE)
  File "/home/vincent/ml/keras/keras/models.py", line 371, in fit
    validation_split=validation_split, val_f=val_f, val_ins=val_ins, shuffle=shuffle, metrics=metrics)
  File "/home/vincent/ml/keras/keras/models.py", line 135, in _fit
    outs = f(*ins_batch)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 606, in __call__
    storage_map=self.fn.storage_map)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 595, in __call__
    outputs = self.fn()
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/op.py", line 768, in rval
    r = p(n, [x[0] for x in i], o)
  File "/usr/local/lib/python2.7/dist-packages/theano/tensor/subtensor.py", line 2088, in perform
    out[0] = inputs[0].__getitem__(inputs[1:])
IndexError: index 98 is out of bounds for axis 1 with size 98
Apply node that caused the error: AdvancedSubtensor(Elemwise{Add}[(0, 0)].0, Subtensor{int64}.0, Subtensor{int64}.0)
Inputs types: [TensorType(float32, 3D), TensorType(int64, vector), TensorType(int64, vector)]
Inputs shapes: [(1, 98, 1), (100,), (100,)]
Inputs strides: [(392, 4, 4), (8,), (8,)]
Inputs values: ['not shown', 'not shown', 'not shown']

Backtrace when the node is created:
  File "/home/vincent/ml/keras/keras/models.py", line 56, in weighted
    masked_y_pred = y_pred[weights.nonzero()[:-1]]

This is just a Convolution1D + Dropout + pooling encoder with a straightforward single Dense-layer decoder.

valexandersaulys commented 9 years ago

So when I set the filter_length to 1, it runs fine. Somewhere the length is being shortened, so I'm assuming my decoder has to take this into account...

valexandersaulys commented 9 years ago

...which means I can't use a Dense layer for my decoder, right? It's mentioned that padding can change the steps dimension for Convolution1D; is there a way to mitigate or stop this? What about adding my own padding prior to the computation?

valexandersaulys commented 9 years ago

Accounting for padding fixes everything, but why does Keras chop off the last n rows during autoencoding? (n equals the filter_length minus one.) At least that's what I think it's doing; the output is definitely dimensionally smaller than the input.

Thanks!
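
For later readers: the shrinkage is just "valid" convolution arithmetic, not something autoencoding-specific. With no padding, a Conv1D with filter_length f maps a length-L input to length L - f + 1, which is exactly the 100 -> 98 mismatch in the traceback above (consistent with f = 3, presumably the filter_length used here). A quick shape check in modern tf.keras, where border_mode has since become padding and ZeroPadding1D is the "add my own padding" route:

```python
from tensorflow import keras
from tensorflow.keras import layers

L, f = 100, 3
inp = keras.Input(shape=(L, 1))

valid = layers.Conv1D(8, f, padding="valid")(inp)    # length L - f + 1 = 98
same = layers.Conv1D(8, f, padding="same")(inp)      # length preserved: 100
manual = layers.Conv1D(8, f, padding="valid")(
    layers.ZeroPadding1D((f - 1) // 2)(inp))         # pad 1 step per side -> 100

print(valid.shape, same.shape, manual.shape)  # (None, 98, 8) (None, 100, 8) (None, 100, 8)
```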