keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

Auto encoder classification in Keras #8609

Closed Tinarights closed 3 years ago

Tinarights commented 6 years ago

Hello, I have an issue that I think comes from the dimensions. I am trying to find useful code for improving classification using an autoencoder. I followed the example "Keras autoencoder vs PCA", but instead of MNIST I tried to use it with CIFAR-10.

I made some changes, but something does not fit. Could anyone please help me with this? If you have another example that runs on a different dataset, that would help too.

The classifier trained in reduced.fit, with (X_test, Y_test) as validation data, does not learn: every epoch reports val_loss: 2.3026 - val_acc: 0.1000, and .evaluate() returns the same wrong accuracy. This is the code and the output:

`

from keras.datasets import  cifar10
from keras.models import Model
from keras.layers import Input, Dense
from keras.utils import np_utils
import numpy as np

num_train = 50000
num_test = 10000

height, width, depth = 32, 32, 3 # CIFAR-10 images are 32x32 RGB
num_classes = 10 # there are 10 classes

(X_train, y_train), (X_test, y_test) = cifar10.load_data()

X_train = X_train.reshape(num_train,height * width * depth)
X_test = X_test.reshape(num_test,height * width*depth)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')

X_train /= 255 # Normalise data to [0, 1] range
X_test /= 255 # Normalise data to [0, 1] range

Y_train = np_utils.to_categorical(y_train, num_classes) # One-hot encode the labels
Y_test = np_utils.to_categorical(y_test, num_classes) # One-hot encode the labels

input_img = Input(shape=(height * width * depth,))
s=height * width * depth
x = Dense(s, activation='relu')(input_img)

encoded = Dense(s//2, activation='relu')(x)
encoded = Dense(s//8, activation='relu')(encoded)

y = Dense(s//256, activation='relu')(x) # 12-unit bottleneck; it is fed from x, so the two `encoded` layers above are unused

decoded = Dense(s//8, activation='relu')(y)
decoded = Dense(s//2, activation='relu')(decoded)

z = Dense(s, activation='sigmoid')(decoded)
model = Model(input_img, z)

model.compile(optimizer='adadelta', loss='mse') # reconstruction loss only; no accuracy metric is reported

model.fit(X_train, X_train,
      nb_epoch=10,
      batch_size=128,
      shuffle=True,
      validation_data=(X_test, X_test))

mid = Model(input_img, y)
reduced_representation =mid.predict(X_test)

out = Dense(num_classes, activation='softmax')(y)
reduced = Model(input_img, out)
reduced.compile(loss='categorical_crossentropy',
          optimizer='adam',
          metrics=['accuracy'])

reduced.fit(X_train, Y_train,
      nb_epoch=10,
      batch_size=128,
      shuffle=True,
      validation_data=(X_test, Y_test))

scores = reduced.evaluate(X_test, Y_test, verbose=0)
print("Accuracy: ", scores[1])

`

Here is the output

Train on 50000 samples, validate on 10000 samples
Epoch 1/10
50000/50000 [==============================] - 5s - loss: 0.0639 - val_loss: 0.0633
Epoch 2/10
50000/50000 [==============================] - 5s - loss: 0.0610 - val_loss: 0.0568
Epoch 3/10
50000/50000 [==============================] - 5s - loss: 0.0565 - val_loss: 0.0558
Epoch 4/10
50000/50000 [==============================] - 5s - loss: 0.0557 - val_loss: 0.0545
Epoch 5/10
50000/50000 [==============================] - 5s - loss: 0.0536 - val_loss: 0.0518
Epoch 6/10
50000/50000 [==============================] - 5s - loss: 0.0502 - val_loss: 0.0461
Epoch 7/10
50000/50000 [==============================] - 5s - loss: 0.0443 - val_loss: 0.0412
Epoch 8/10
50000/50000 [==============================] - 5s - loss: 0.0411 - val_loss: 0.0397
Epoch 9/10
50000/50000 [==============================] - 5s - loss: 0.0391 - val_loss: 0.0371
Epoch 10/10
50000/50000 [==============================] - 5s - loss: 0.0377 - val_loss: 0.0403
Train on 50000 samples, validate on 10000 samples
Epoch 1/10
50000/50000 [==============================] - 3s - loss: 2.3605 - acc: 0.0977 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 2/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0952 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 3/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0978 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 4/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0980 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 5/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0974 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 6/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.1000 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 7/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0992 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 8/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0982 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 9/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0965 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 10/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0978 - val_loss: 2.3026 - val_acc: 0.1000
 9856/10000 [============================>.] - ETA: 0s
('Accuracy: ', 0.10000000000000001)

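A note on the numbers in this log: a loss stuck at 2.3026 with 10% accuracy is what categorical cross-entropy gives when the softmax assigns the same probability, 1/10, to every class, since -ln(1/10) ≈ 2.3026. One possible cause (an assumption, not confirmed anywhere in this thread) is that the 12 ReLU units of the bottleneck layer `y` all output zero, so the softmax layer sees a constant input. The sketch below checks both points with NumPy only; `uniform_crossentropy` and `dead_unit_fraction` are hypothetical helper names, and the toy array stands in for `reduced_representation` from the script above.

```python
import numpy as np

def uniform_crossentropy(num_classes):
    """Cross-entropy of a prediction that gives 1/num_classes to every class."""
    return -np.log(1.0 / num_classes)

def dead_unit_fraction(activations, tol=0.0):
    """Fraction of units (columns) whose activation is <= tol for every sample."""
    activations = np.asarray(activations)
    dead = np.all(activations <= tol, axis=0)
    return float(dead.mean())

# Loss of a classifier that predicts 10 equally likely classes.
print(round(uniform_crossentropy(10), 4))  # 2.3026

# Toy stand-in for mid.predict(X_test): 4 samples, 3 units, the middle unit never fires.
toy = np.array([[0.5, 0.0, 0.1],
                [0.0, 0.0, 0.3],
                [0.2, 0.0, 0.0],
                [0.9, 0.0, 0.4]])
print(dead_unit_fraction(toy))
```

If `dead_unit_fraction(reduced_representation)` came out close to 1.0, that would support the dead-ReLU explanation; a value near 0 would point elsewhere.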
xuefeng7 commented 6 years ago

Is there any way to train the autoencoder and the classifier jointly? I think the representation could then be optimized for the best classification results instead of solely minimizing the reconstruction loss.
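A minimal sketch of one way this could be done in Keras: give a single model two outputs, the reconstruction and the class probabilities, and train on a weighted sum of the two losses. The function name `build_joint_model`, the layer sizes, and the loss weights are illustrative assumptions, not something taken from the posts in this thread.

```python
import numpy as np
from keras.layers import Input, Dense
from keras.models import Model

def build_joint_model(input_dim, code_dim, num_classes, classification_weight=1.0):
    """Autoencoder and classifier sharing one encoder, trained on both losses at once."""
    inputs = Input(shape=(input_dim,))
    code = Dense(code_dim, activation='relu')(inputs)        # shared representation
    decoded = Dense(input_dim, activation='sigmoid')(code)   # reconstruction head
    probs = Dense(num_classes, activation='softmax')(code)   # classification head
    model = Model(inputs, [decoded, probs])
    # Total loss = 1.0 * mse + classification_weight * categorical cross-entropy.
    model.compile(optimizer='adam',
                  loss=['mse', 'categorical_crossentropy'],
                  loss_weights=[1.0, classification_weight])
    return model

# Tiny random data, only to show the call pattern.
rng = np.random.RandomState(0)
x = rng.rand(64, 48).astype('float32')
labels = np.eye(10, dtype='float32')[rng.randint(0, 10, size=64)]

model = build_joint_model(input_dim=48, code_dim=16, num_classes=10)
model.fit(x, [x, labels], epochs=1, batch_size=16, verbose=0)
reconstruction, class_probs = model.predict(x, verbose=0)
print(reconstruction.shape, class_probs.shape)  # (64, 48) (64, 10)
```

With the CIFAR-10 arrays from the first post, the corresponding call would be `model.fit(X_train, [X_train, Y_train], ...)`. The weight on the classification loss controls how far the shared code is pulled toward class separation rather than reconstruction.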

FrancisYizhang commented 6 years ago

@xuefeng7 I am also interested in this. Did you find a solution?

10037

yashenkoxciv commented 6 years ago

Yep, you can "connect" features from the discriminator (classifier) to the decoder. Another approach is provided by adversarial autoencoders: https://towardsdatascience.com/a-wizards-guide-to-adversarial-autoencoders-part-4-classify-mnist-using-1000-labels-2ca08071f95

ChristineDewi commented 5 years ago

Hello everyone, I have the same problem. I am trying to find useful code for improving classification using an autoencoder. I followed the example "Keras autoencoder vs PCA", but instead of MNIST I tried to use it with the GTSR dataset. This is my code:

from keras.layers import Input, Dense
from keras.models import Model
from keras import regularizers
from keras.datasets import mnist
from keras import backend as K
import numpy as np
import matplotlib.pyplot as plt
import pickle
from matplotlib import pyplot

import cv2
import pandas as pd

# Single fully-connected neural layer as encoder and decoder

use_regularizer = True
my_regularizer = None
my_epochs = 100
features_path = 'simple_autoe_features.pickle'
labels_path = 'simple_autoe_labels.pickle'

if use_regularizer:
    # add a sparsity constraint on the encoded representations
    # note use of 10e-5 leads to blurred results
    # my_regularizer = regularizers.l1(10e-8)
    my_regularizer = regularizers.l1(10e-12)
    # and a larger number of epochs: with the added regularization the model
    # is less likely to overfit and can be trained longer
    my_epochs = 100
    features_path = 'sparse_autoe_features.pickle'
    labels_path = 'sparse_autoe_labels.pickle'

# this is the size of our encoded representations
encoding_dim = 2048  # larger than the 1024-float input, so this layer does not compress

# this is our input placeholder; 1024 = 32 x 32 (grayscale)
input_img = Input(shape=(1024, ))

# "encoded" is the encoded representation of the inputs
encoded = Dense(encoding_dim, activation='relu', activity_regularizer=my_regularizer)(input_img)

# "decoded" is the lossy reconstruction of the input
decoded = Dense(1024, activation='sigmoid')(encoded)

# this model maps an input to its reconstruction
autoencoder = Model(input_img, decoded)

# Separate Encoder model

# this model maps an input to its encoded representation
encoder = Model(input_img, encoded)

# Separate Decoder model

# create a placeholder for an encoded (encoding_dim-dimensional) input
encoded_input = Input(shape=(encoding_dim,))

# retrieve the last layer of the autoencoder model
decoder_layer = autoencoder.layers[-1]

# create the decoder model
decoder = Model(encoded_input, decoder_layer(encoded_input))

# Train to reconstruct the traffic-sign images

from keras import optimizers
from keras.optimizers import SGD

opt = SGD(lr=0.1, momentum=0.9)

adam = optimizers.Adam(lr=0.01)

# configure model to use a per-pixel binary crossentropy loss, and the Adam optimizer
autoencoder.compile(optimizer='adam', loss='binary_crossentropy' , metrics=['accuracy'])

# this second compile call replaces the configuration above
customAdam = optimizers.Adam(lr=0.001)  # you have no idea how many times I changed this number
autoencoder.compile(optimizer=customAdam,  # Optimizer
          # Loss function to minimize
          loss="mean_squared_error",
          # List of metrics to monitor
          metrics=["accuracy"])

# prepare input data
# MNIST arrays left over from the tutorial; overwritten by the traffic-sign data below
(x_train, _), (x_test, y_test) = mnist.load_data()

train = pd.read_pickle('./traffic-signs-data/train.p')
test = pd.read_pickle('./traffic-signs-data/test.p')
(x_train1, _) = train['features'], train['labels']

x_train = []
x_test = []

# convert every RGB image to grayscale
for i in x_train1:
    i = cv2.cvtColor(i, cv2.COLOR_RGB2GRAY)
    x_train.append(i)

(x_test1, y_test) = test['features'], test['labels']
for i in x_test1:
    i = cv2.cvtColor(i, cv2.COLOR_RGB2GRAY)
    x_test.append(i)

# normalize all values between 0 and 1 and flatten the 32x32 images into vectors of size 1024
x_train = np.array(x_train)
x_test = np.array(x_test)
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
print(x_train.shape)
print(x_test.shape)

# Train autoencoder for my_epochs (100) epochs
history = autoencoder.fit(x_train, x_train, epochs=my_epochs, batch_size=128, shuffle=True, validation_data=(x_test, x_test), verbose=2)

# after 50/100 epochs the autoencoder seems to reach a stable train/test loss value

# Visualize the reconstructed encoded representations

# encode and decode some images
# note that we take them from the test set
encoded_imgs = encoder.predict(x_test)
decoded_imgs = decoder.predict(encoded_imgs)

# evaluate the model
_, train_acc = autoencoder.evaluate(x_train, x_train, verbose=0)
_, test_acc = autoencoder.evaluate(x_test, x_test, verbose=0)
print(train_acc, test_acc)

# plot loss during training
pyplot.subplot(211)
pyplot.title('Loss')
pyplot.plot(history.history['loss'], label='train')
pyplot.plot(history.history['val_loss'], label='test')
pyplot.legend()

# plot accuracy during training (disabled)
"""pyplot.subplot(212)
pyplot.title('Accuracy')
pyplot.plot(history.history['train_acc'], label='train')
pyplot.plot(history.history['val_acc'], label='test')
pyplot.legend()
pyplot.show()"""

# save the latent space features (one encoding_dim-d vector per test image)
pickle.dump(encoded_imgs, open(features_path, 'wb'))
pickle.dump(y_test, open(labels_path, 'wb'))

n = 6  # how many images we will display
plt.figure(figsize=(10, 2), dpi=100)
for i in range(n):
    # display original
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(32, 32))
    plt.gray()
    ax.set_axis_off()

    # display reconstruction
    ax = plt.subplot(2, n, i + n + 1)
    plt.imshow(decoded_imgs[i].reshape(32, 32))
    plt.gray()
    ax.set_axis_off()

plt.show()

K.clear_session()

Here is the output: Epoch 99/100

Could somebody please help me if you know the answer?
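One observation on the script above: it only trains and evaluates the autoencoder. No classifier is ever fitted to the encoded features, and the `accuracy` metric computed on a reconstruction is not a classification accuracy. A minimal sketch of one possible next step, assuming the saved features and integer class labels have been loaded as NumPy arrays, is to fit a small softmax classifier on them. The helper names `fit_softmax` and `predict_classes` are hypothetical, the hyperparameters are arbitrary, and the toy clusters stand in for the pickled features.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_softmax(features, labels, num_classes, lr=0.1, epochs=200):
    """Multinomial logistic regression trained with full-batch gradient descent."""
    n, d = features.shape
    W = np.zeros((d, num_classes))
    b = np.zeros(num_classes)
    onehot = np.eye(num_classes)[labels]
    for _ in range(epochs):
        probs = softmax(features @ W + b)
        grad = (probs - onehot) / n  # gradient of the mean cross-entropy w.r.t. the logits
        W -= lr * features.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

def predict_classes(features, W, b):
    return np.argmax(features @ W + b, axis=1)

# Toy stand-in for the encoded features: two well-separated clusters.
rng = np.random.RandomState(0)
feats = np.vstack([rng.normal(-2.0, 0.5, size=(50, 4)),
                   rng.normal(2.0, 0.5, size=(50, 4))])
labs = np.array([0] * 50 + [1] * 50)

W, b = fit_softmax(feats, labs, num_classes=2)
accuracy = (predict_classes(feats, W, b) == labs).mean()
print(accuracy)
```

Because the script saves features for the test images only, a real run would need to encode the training images as well (or split the saved features) so that the classifier is fitted and scored on different samples.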