keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

Keras autoencoder accuracy/loss doesn't change #1476

Closed haoqi closed 7 years ago

haoqi commented 8 years ago

I have the same problem. I found this question on Stack Overflow, so I am just copying the link: https://bim360field.cn/questions/34660337/keras-autoencoder-accuracy-loss-doesnt-change Does anyone have any ideas?

vsuriya93 commented 8 years ago

I am facing a similar problem. The accuracy is not improving at all; the model predicts the same class for every input. I am using deep learning for image classification on the CIFAR dataset. Here is the model code I am using.

model = Sequential()
model.add(Convolution2D(32, 3, 3, border_mode='same', input_shape=(3, 32, 32)))
model.add(Activation('relu'))
model.add(Convolution2D(32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Convolution2D(64, 3, 3, border_mode='same'))
model.add(Activation('relu'))
model.add(Convolution2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)  # error can also be mse

I have binarized the labels. The data shapes are train -> (40000,3,32,32) and test -> (10000,3,32,32), so 50000 images in total.

I have tried learning rates between 0.1 and 0.0001, but all the instances are still predicted as the same class. What could the problem be?
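
One way to confirm the symptom described above is to count which class wins the argmax across a batch of predictions. The sketch below is a hedged, pure-Python illustration: `predicted_class_counts`, `looks_collapsed`, and `scale_pixels` are hypothetical helper names, not Keras functions, and the 0.99 threshold is an arbitrary assumption. The post does not say whether the pixel values were scaled, so the last helper only illustrates rescaling 0-255 inputs to [0, 1], which is one common thing to check when every sample lands in the same class.

```python
from collections import Counter


def predicted_class_counts(probabilities):
    """Count how often each class index has the highest score in a batch."""
    winners = [max(range(len(row)), key=row.__getitem__) for row in probabilities]
    return Counter(winners)


def looks_collapsed(probabilities, threshold=0.99):
    """Return True if a single class wins at least `threshold` of the batch."""
    counts = predicted_class_counts(probabilities)
    top_share = counts.most_common(1)[0][1] / float(len(probabilities))
    return top_share >= threshold


def scale_pixels(values, max_value=255.0):
    """Rescale raw pixel intensities to the [0, 1] range."""
    return [v / max_value for v in values]


# Toy softmax rows standing in for model.predict(X_test) output.
batch = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]]
print(predicted_class_counts(batch))
print(looks_collapsed(batch))
```

If the counts show a single class for the whole test set, the network has collapsed to a constant output rather than merely being inaccurate, which points at the inputs, labels, or optimizer settings rather than at the evaluation code.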

around1991 commented 8 years ago

Try this for the AutoEncoder issue: https://groups.google.com/forum/#!topic/keras-users/Jt9bpvbxV_o

Kris

haoqi commented 8 years ago

This does not work for me. It fails with: TypeError: add() got an unexpected keyword argument 'input_dim'

around1991 commented 8 years ago

Try input_shape and wrap your input shape into a tuple instead (putting it on here as well for search purposes).

vsuriya93 commented 8 years ago

I don't follow, @around1991. Can you please tell me where and how to modify the code?

around1991 commented 8 years ago

Sorry, my original diagnosis was incomplete. Setting the input_shape keyword will call build() too early, and won't solve the issue. For now, you'll have to manually call AutoEncoder.build() after adding it as the first layer of your network for it to do the right thing.

SnowRipple commented 8 years ago

Now it fails completely: (Shape of training data is (1000, 1156))

encoder = containers.Sequential([Dense(512,input_dim=1156, activation='sigmoid')])

decoder = containers.Sequential([Dense(1156,input_dim=512)])

autoencoder.add(AutoEncoder(encoder=encoder, decoder=decoder, 
                    output_reconstruction=False))
#autoencoder.layers[0].build()
autoencoder.build()  

autoencoder.compile(loss='mean_squared_error', optimizer='adam')

autoencoder.fit(X_train, X_train, nb_epoch=epochs, batch_size=batchSize,
           validation_split=0.1, show_accuracy=True,verbose=2
           )

Using Theano backend.
Using gpu device 0: GeForce GTX TITAN X (CNMeM is disabled)
Shape of training data is (1000, 1156)
Train on 900 samples, validate on 100 samples
Epoch 1/5
Traceback (most recent call last):
  File "/home/piotr/workspace/vessel_segmentation/drive_pylearn2/autoencoder.py", line 456, in standard_autoencoder()
  File "/home/piotr/workspace/vessel_segmentation/drive_pylearn2/autoencoder.py", line 410, in standard_autoencoder
    validation_split=0.1, show_accuracy=True,verbose=2
  File "build/bdist.linux-x86_64/egg/keras/models.py", line 583, in fit
  File "build/bdist.linux-x86_64/egg/keras/models.py", line 256, in _fit
  File "build/bdist.linux-x86_64/egg/keras/models.py", line 310, in _test_loop
  File "build/bdist.linux-x86_64/egg/keras/backend/theano_backend.py", line 385, in call
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 871, in call
    storage_map=getattr(self.fn, 'storage_map', None))
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/link.py", line 314, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 859, in call
    outputs = self.fn()
ValueError: GpuElemwise. Input dimension mis-match. Input 1 (indices start at 0) has shape[1] == 1156, but the output's size on that axis is 512.
Apply node that caused the error: GpuElemwise{Composite{(scalar_sigmoid(i0) - i1)}}[(0, 0)](GpuElemwise{Add}[(0, 0)].0, GpuFromHost.0)
Toposort index: 16
Inputs types: [CudaNdarrayType(float32, matrix), CudaNdarrayType(float32, matrix)]
Inputs shapes: [(100, 512), (100, 1156)]
Inputs strides: [(512, 1), (1156, 1)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[GpuCAReduce{pre=sqr,red=add}{0,1}(GpuElemwise{Composite{(scalar_sigmoid(i0) - i1)}}[(0, 0)].0)]]

Debugprint of the apply node:
GpuElemwise{Composite{(scalar_sigmoid(i0) - i1)}}[(0, 0)] [id A] <CudaNdarrayType(float32, matrix)> ''
|GpuElemwise{Add}[(0, 0)] [id B] <CudaNdarrayType(float32, matrix)> ''
| |GpuDot22 [id C] <CudaNdarrayType(float32, matrix)> ''
| | |GpuFromHost [id D] <CudaNdarrayType(float32, matrix)> ''
| | | |<TensorType(float32, matrix)> [id E] <TensorType(float32, matrix)>
| | |<CudaNdarrayType(float32, matrix)> [id F] <CudaNdarrayType(float32, matrix)>
| |GpuDimShuffle{x,0} [id G] <CudaNdarrayType(float32, row)> ''
| |<CudaNdarrayType(float32, vector)> [id H] <CudaNdarrayType(float32, vector)>
|GpuFromHost [id I] <CudaNdarrayType(float32, matrix)> ''
|<TensorType(float32, matrix)> [id J] <TensorType(float32, matrix)>

Storage map footprint:

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.

SnowRipple commented 8 years ago

Does anyone have an actual autoencoder example working with Keras? It's been days/weeks now since it's broken...

around1991 commented 8 years ago

You need to call autoencoder.layers[0].build(), not autoencoder.build(). The build method is on the AutoEncoder instance, not the Sequential instance.

SnowRipple commented 8 years ago

I tried autoencoder.layers[0].build() (it is commented out in my code but I tried it as well)

It is still failing

SnowRipple commented 8 years ago

Can you provide me with the simplest autoencoder example that works for you (e.g. with mnist data) so I can try run it locally, maybe my Keras installation is not right?

around1991 commented 8 years ago

from utils import StackedAutoEncoder
import pdb

from keras.layers.core import Dense, AutoEncoder
from keras.layers import containers
from keras.models import Sequential
from keras.datasets import mnist
from keras.utils import np_utils

batch_size = 64
nb_classes = 10
nb_epochs = 2
hidden_layers = [784, 600, 500, 400]

(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(-1, 784)
X_test = X_test.reshape(-1, 784)
X_train = X_train.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0

Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)

# A simple one-layer autoencoder model
print 'Compiling and fitting simple AE'
simple_ae = Sequential()
encoder = Dense(input_dim=784, output_dim=400,
                W_regularizer='l2', activation='tanh')
decoder = Dense(input_dim=400, output_dim=784,
                W_regularizer='l2', activation='tanh')
simple_ae.add(AutoEncoder(encoder=encoder,
                          decoder=decoder,
                          output_reconstruction=False))
simple_ae.layers[0].build()
simple_ae.compile(loss='mse', optimizer='rmsprop')
simple_ae.fit(X_train, X_train, batch_size=batch_size, nb_epoch=nb_epochs)

This works for me. Interestingly enough, I can't get autoencoders with containers as the encoder and decoder to learn anything either, although they at least compile.

SnowRipple commented 8 years ago

Thanks for this!

Yes, your code works. However I need multiple layers, and containers are failing as you mentioned.

Interestingly enough, when I compared your model to mine, the part that causes trouble is the validation split. When I added it to your code, it failed with the same error as before:

simple_ae.fit(X_train, X_train, batch_size=batch_size, validation_split=0.1, nb_epoch=nb_epochs)

ValueError: GpuElemwise. Input dimension mis-match. Input 2 (indices start at 0) has shape[1] == 784, but the output's size on that axis is 400. Apply node that caused the error: GpuElemwise{Composite{(tanh((i0 + i1)) - i2)}}[(0, 0)]

So no validation for autoencoders then?

SnowRipple commented 8 years ago

Update: Multiple containers work as well (you used rmsprop, so maybe it required more time to converge; adam converges nicely on my data).

So the problem is the validation split, but there is no indication of this in the error message. I need to check it more closely.
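
A possible explanation for why only validation fails, offered as an assumption rather than something confirmed in this thread: with output_reconstruction=False, the old AutoEncoder layer appears to return the decoder output during training but the encoder output at test time. The validation loss would then compare a hidden code (400 or 512 wide) with a full-width target (784 or 1156), which matches the shapes in both tracebacks. The sketch below models that rule in plain Python; `emitted_width` and `check_loss_shapes` are hypothetical names, not part of Keras.

```python
def emitted_width(input_dim, hidden_dim, train, output_reconstruction):
    """Width of the layer output under the assumed AutoEncoder rule."""
    # Assumption: the hidden code is returned only when the layer is not in
    # training mode and output_reconstruction is switched off.
    if not train and not output_reconstruction:
        return hidden_dim
    return input_dim


def check_loss_shapes(output_width, target_width):
    """Mimic the element-wise loss refusing mismatched column counts."""
    if output_width != target_width:
        raise ValueError(
            "Input dimension mis-match: output has %d columns, target has %d"
            % (output_width, target_width))
    return True


# Training pass: reconstruction (784) against target (784) lines up.
check_loss_shapes(emitted_width(784, 400, train=True, output_reconstruction=False), 784)

# Validation pass with output_reconstruction=False: code (400) against target (784).
try:
    check_loss_shapes(emitted_width(784, 400, train=False, output_reconstruction=False), 784)
except ValueError as err:
    validation_error = str(err)
    print(validation_error)
```

If this reading is right, fitting with output_reconstruction=True (and switching it off only afterwards, when the hidden codes are wanted) should let validation_split work, though that is untested here.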

stale[bot] commented 7 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

maryam2013 commented 6 years ago

Hi there, I wrote these commands:

nb_classes=2
print('Build model...')
model_rnn = Sequential()
model_rnn.add(Embedding(vocab_dic_size, 128))
model_rnn.add(SimpleRNN(128, dropout=0.2, recurrent_dropout=0.2))
np_utils.to_categorical(y_datasetpad, nb_classes=2)
model_rnn.add (categorical_labels = to_categorical(y_datasetpad, nb_classes=2))
model_rnn.add(Dense(2, activation='softmax'))

model_rnn.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

print('Train...')
model_rnn.fit(x_datasetpad, y_datasetpad, batch_size=32, epochs=2, validation_data=(x_datasetpad, y_datasetpad ))

My goal is to get the probability that each sample belongs to each class. For example, I want the softmax output to give 40% for class 1 and 60% for class 0; I am not interested in the accuracy. But it gave me this error:

File "/home/mary/.config/spyder-py3/finalized_rnn.py", line 182, in model_rnn.add (categorical_labels = to_categorical(y_datasetpad, nb_classes=2))

TypeError: to_categorical() got an unexpected keyword argument 'nb_classes'

How can I fix this error? Any comments will be appreciated.
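
The TypeError most likely comes from the Keras 2 rename of the `nb_classes` keyword of `to_categorical` to `num_classes`. Separately, `model.add()` expects a layer, so the label conversion belongs before `fit`, not inside `add`. The sketch below is a hedged, pure-Python stand-in that shows what the conversion and the softmax output look like; `one_hot` and `softmax` here are illustrative helpers, not the Keras implementations.

```python
import math


def one_hot(labels, num_classes):
    """Stand-in for to_categorical(labels, num_classes=...): one row per label."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[int(label)] = 1.0
        rows.append(row)
    return rows


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for stability
    total = sum(exps)
    return [e / total for e in exps]


# Integer class labels become two-column targets for a Dense(2, softmax) output.
targets = one_hot([0, 1, 1], num_classes=2)
print(targets)

# A softmax row is already the per-class probability for one sample.
print(softmax([0.4, 0.0]))
```

With the real library this would presumably read `y_onehot = to_categorical(y_datasetpad, num_classes=2)`, with `y_onehot` passed to `fit` and `model_rnn.predict(x_datasetpad)` returning one probability per class for each sample. A two-unit softmax with one-hot labels is usually paired with `categorical_crossentropy`.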

chih-hong commented 6 years ago

ValueError: Input dimension mis-match. (input[0].shape[1] = 512, input[1].shape[1] = 2048)
Apply node that caused the error: Elemwise{Composite{(i0 + (((i1 - i2) / i3) i4) + i5)}}(ctx_res5b, CorrMM{valid, (1, 1), (1, 1)}.0, InplaceDimShuffle{x,0,x,x}.0, Elemwise{Composite{sqrt((i0 + i1))}}.0, InplaceDimShuffle{x,0,x,x}.0, InplaceDimShuffle{x,0,x,x}.0)
Toposort index: 268
Inputs types: [TensorType(float32, 4D), TensorType(float32, 4D), TensorType(float32, (True, False, True, True)), TensorType(float32, (True, False, True, True)), TensorType(float32, (True, False, True, True)), TensorType(float32, (True, False, True, True))]
Inputs shapes: [(2, 512, 7, 7), (2, 2048, 7, 7), (1, 2048, 1, 1), (1, 2048, 1, 1), (1, 2048, 1, 1), (1, 2048, 1, 1)]
Inputs strides: [(100352, 196, 28, 4), (401408, 196, 28, 4), (8192, 4, 4, 4), (8192, 4, 4, 4), (8192, 4, 4, 4), (8192, 4, 4, 4)]
Inputs values: ['not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown']
Outputs clients: [[Elemwise{Composite{(i0 (i1 + Abs(i1)))}}(TensorConstant{(1, 1, 1, 1) of 0.5}, Elemwise{Composite{(i0 + (((i1 - i2) / i3) i4) + i5)}}.0), Elemwise{Composite{((((i0 i1) + (i0 i1 sgn(i2))) i3) / i4)}}[(0, 1)](TensorConstant{(1, 1, 1, 1) of 0.5}, AveragePoolGrad{ignore_border=True, mode='average_exc_pad', ndim=2}.0, Elemwise{Composite{(i0 + (((i1 - i2) / i3) i4) + i5)}}.0, InplaceDimShuffle{x,0,x,x}.0, Elemwise{Composite{sqrt((i0 + i1))}}.0)]]

How can I fix this error?