Problems with running simple autoencoder in R keras

fedxa commented 4 years ago

Hello, I am trying to create a simple autoencoder. The Python version from https://blog.keras.io/building-autoencoders-in-keras.html works perfectly. However, the following version in R (see also https://github.com/rstudio/keras/issues/228)

library(keras)
# use_condaenv()

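# mini-batch size (defined here but unused: fit() below hard-codes 256)
batch_size <- 256L
# length of a flattened 28 x 28 MNIST image
original_dim <- 784L
# size of the reconstructed output; must equal original_dim
latent_dim <- 784L
# this is the size of our encoded representations
encoding_dim <- 32L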

# this is our input placeholder
input_img <- layer_input(shape = original_dim)
# "encoded" is the compressed representation of the input
encoded <- layer_dense(input_img, encoding_dim, activation = "relu")
# "decoded" is the lossy reconstruction of the input
decoded <- layer_dense(encoded, latent_dim, activation = "sigmoid")

# this model maps an input to its reconstruction
autoencoder = keras_model(input_img, decoded)

summary(autoencoder)

# this model maps an input to its encoded representation
encoder = keras_model(input_img, encoded)

# create a placeholder for an encoded (32-dimensional) input
encoded_input = layer_input(shape=c(encoding_dim))
# retrieve the last layer of the autoencoder model
decoder_layer = get_layer(autoencoder, index=3)
# create the decoder model
decoder = keras_model(encoded_input, decoder_layer(encoded_input))

autoencoder  %>% compile(optimizer = 'adadelta', loss = 'binary_crossentropy')

summary(autoencoder)

mnist <- dataset_mnist()
x_train <- mnist$train$x
y_train <- mnist$train$y
x_test <- mnist$test$x
y_test <- mnist$test$y

# reshape
x_train <- array_reshape(x_train, c(nrow(x_train), 784))
x_test <- array_reshape(x_test, c(nrow(x_test), 784))
# rescale
x_train <- x_train / 255
x_test <- x_test / 255

print(dim(x_train))
print(dim(x_test))

mode(x_train) <- "single"
mode(x_test) <- "single"

fit(autoencoder, x_train, x_train,
                epochs=20,
                batch_size=256,
                shuffle=TRUE,
                validation_data=list(x_test, x_test))

# encode and decode some digits
# note that we take them from the *test* set
encoded_imgs <- predict(encoder, x_test)
decoded_imgs <- predict(decoder, encoded_imgs)

# show the same test digit before and after reconstruction
image(array_reshape(x_test[3,], dim=c(28,28))[,28:1], col=gray.colors(255))
image(array_reshape(decoded_imgs[3,], dim=c(28,28))[,28:1], col=gray.colors(255))

produces just noise in the decoded image. Also, during the fit step the loss stays at ~0.69 and does not fall as it does in the Python version.
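A note on the ~0.69 figure: it is what binary cross-entropy gives when every output pixel is 0.5, which is roughly what a sigmoid layer produces before it has learned anything. In that case the loss equals ln(2) whatever the target values are. The sketch below checks this in plain Python; the helper `binary_crossentropy` and the sample pixel values are illustrative and not taken from keras.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over paired lists of targets and predictions."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        # clip predictions away from 0 and 1 so log() stays finite
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(y_true)

# hypothetical pixel intensities in [0, 1]
pixels = [0.0, 0.1, 0.5, 0.9, 1.0]
# an output layer that predicts 0.5 for every pixel
untrained = [0.5] * len(pixels)

loss_at_half = binary_crossentropy(pixels, untrained)
print(round(loss_at_half, 4))  # 0.6931, i.e. ln(2)
```

A loss stuck near 0.69 therefore suggests that the weights are barely moving, not that the data or the model definition is wrong.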

Any idea what the bug might be?

I use R 3.6.3 with keras 2.2.5.0 and the default tensorflow 2.0, installed in a conda environment. The first classification example from https://keras.rstudio.com/articles/getting_started.html works, by the way.

Any help would be appreciated.

turgut090 commented 4 years ago

Hi. On my machine R and Python give the same result (the same loss of 0.69). Can you check which versions of tensorflow and keras R and Python are each using? They are probably not the same. The Keras example you linked is old; new examples are currently being updated by François Chollet at https://github.com/keras-team/keras-io

Python

import tensorflow as tf
import keras as k
tf.__version__
k.__version__

R

tensorflow::tf_version()
keras:::keras_version()

fedxa commented 4 years ago

I have the following. Python:

>>> tf.__version__
'2.0.0'
>>> k.__version__
'2.3.1'

and R

> tensorflow::tf_version()
[1] ‘2.0’
> keras:::keras_version()
[1] ‘2.2.4’

I also checked on another OS (macOS with a fresh R installation) with exactly the same result: the Python version works, the R version does not.

fedxa commented 4 years ago

Ah, I see a catch: in Python I was using

import keras

which imported the keras 2.3.1 package, while R seems to use the following:

>>> import tensorflow.keras as kk
>>> kk.__version__
'2.2.4-tf'

That is the keras bundled within tensorflow, which has a different version. Hmm.

fedxa commented 4 years ago

OK, I solved the problem for myself:

use_implementation("keras")

makes R use the standalone keras Python module instead of tensorflow.keras, and after that it works like a charm, with the following package versions:

> tensorflow::tf_version()
[1] ‘2.0’
> keras:::keras_version()
[1] ‘2.3.1’

This is strange, though: it seems the default tensorflow keras implementation (at least in tensorflow 2.0.0) is incompatible with tensorflow...

I also noticed one difference between the models created with the two versions of keras. The "good" one (from keras 2.3.1) looks like:

> summary(autoencoder)
Model: "model_1"
________________________________________________________________________________
Layer (type)                        Output Shape                    Param #     
================================================================================
input_1 (InputLayer)                (None, 784)                     0           
________________________________________________________________________________
dense_1 (Dense)                     (None, 32)                      25120       
________________________________________________________________________________
dense_2 (Dense)                     (None, 784)                     25872       
================================================================================
Total params: 50,992
Trainable params: 50,992
Non-trainable params: 0
________________________________________________________________________________

while the "bad" one (with keras 2.2.4) looks like

> summary(autoencoder)
Model: "model"
________________________________________________________________________________
Layer (type)                        Output Shape                    Param #     
================================================================================
input_1 (InputLayer)                [(None, 784)]                   0           
________________________________________________________________________________
dense (Dense)                       (None, 32)                      25120       
________________________________________________________________________________
dense_1 (Dense)                     (None, 784)                     25872       
================================================================================
Total params: 50,992
Trainable params: 50,992
Non-trainable params: 0
________________________________________________________________________________

I don't know what it means, but there are brackets around the Output Shape of the input layer in the "bad" version.

fedxa commented 4 years ago

I raised the issue with tensorflow. It turns out to be a problem of different default learning rates for the Adadelta optimizer between the tensorflow and standalone keras versions: https://github.com/tensorflow/tensorflow/issues/39432
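To show why the default matters, here is a small pure-Python sketch of the Adadelta update rule applied to a toy problem, f(x) = x^2 starting from x = 1. It is not the keras or tensorflow implementation, and the function name `adadelta_minimize` is made up for this example. The two learning rates compared, 1.0 and 0.001, are what I understand the standalone keras 2.3.x and tf.keras defaults to be; please check the linked issue to confirm them.

```python
import math

def adadelta_minimize(lr, steps=200, rho=0.95, eps=1e-7):
    """Run Adadelta on f(x) = x**2 from x = 1.0 and return the final loss."""
    x = 1.0
    avg_sq_grad = 0.0    # running average of squared gradients
    avg_sq_delta = 0.0   # running average of squared updates
    for _ in range(steps):
        grad = 2.0 * x
        avg_sq_grad = rho * avg_sq_grad + (1.0 - rho) * grad ** 2
        # Adadelta step: ratio of the two RMS terms times the gradient
        delta = math.sqrt(avg_sq_delta + eps) / math.sqrt(avg_sq_grad + eps) * grad
        # the learning rate only scales the step that is actually applied
        x -= lr * delta
        avg_sq_delta = rho * avg_sq_delta + (1.0 - rho) * delta ** 2
    return x ** 2

loss_lr_1 = adadelta_minimize(lr=1.0)        # assumed standalone keras default
loss_lr_small = adadelta_minimize(lr=0.001)  # assumed tf.keras default
print(loss_lr_1, loss_lr_small)
```

With the small learning rate the loss barely leaves its starting value over the same number of steps, which matches the flat ~0.69 loss seen above. Passing an optimizer object with an explicit learning rate to compile(), instead of the string 'adadelta', should remove the dependence on either default. In R keras that would be optimizer_adadelta(), whose learning-rate argument name may differ between package versions.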

rstudio / keras3

Problems with running simple autoencoder in R keras #1037
