Closed: Atralb closed this issue 5 years ago
@Atralb, it is clearly expecting a shape of rank 1 at the training stage, given how you defined the input shape in the initial Flatten layer. Can you check whether it works with a rank 1 output layer (in the Reshape)? Thanks.
@msymp Thanks for your answer. However, I don't understand what you mean by a rank 1 shape. Do you mean a tensor with only 1 dimension? If so, I don't see why: x_train is of shape (60000,28,28), so x_train.shape[1:] is (28,28).
Hi @ParikhKadam, can you assist @Atralb with the shape error that occurs at the training stage in the above code? Thanks.
@Atralb @msymp
Will explain my answer later but for now, try this code:
import tensorflow as tf
import numpy as np
import pandas as pd
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, Dropout, MaxPooling2D, Flatten, Reshape
import matplotlib.pyplot as plt
from tensorflow.keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train/256.0
x_train = np.expand_dims(x_train, axis=-1)
model3 = Sequential()
model3.add(Flatten(input_shape=(28,28,1)))
model3.add(Dense(300, activation='relu'))
model3.add(Dense(150, activation='relu'))
model3.add(Dense(300, activation='relu'))
model3.add(Dense(784, activation='relu'))
model3.add(Reshape((28,28,1)))
model3.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
model3.summary()
model3.fit(x_train,x_train, epochs=10, batch_size=50)
It will definitely work.. Time to sleep. Ttyl..
@msymp @ParikhKadam Thanks a lot for your help guys. Looking forward to your answer tomorrow Parikh :). I am kind of seeing the issue now but still not clearly understanding it.
This however raised two other issues for me :
PS : this is one of the results after 10 epochs. The sample seems to have been made noisy at the beginning of the architecture, maybe during the flatten layer? (if somebody still wants to see it ^^ https://user-images.githubusercontent.com/35039163/51411496-5ea60c80-1b68-11e9-92f8-2c17bc831162.png)
EDIT : Oh god I'm dumb. I still treated it as a categorical problem with the loss function... My bad. It worked when I changed it to MSE
@Atralb @msymp
(if somebody still wants to see it ^^ https://user-images.githubusercontent.com/35039163/51411496-5ea60c80-1b68-11e9-92f8-2c17bc831162.png)
Had a look at it. You forgot to denormalize the predicted values. Changing the loss function is fine, but to make the model work with this loss function, you should denormalize the predicted values by multiplying the predicted output tensor by 256, and then plot the image.
Now here comes the explanation. I made two changes to your training data:
x_train = x_train/256.0
-- I was forced to normalize the values because the loss function in your code accepts values in the range [0,1) only. Take another look at that range: 1 is not included.
"The additional normalization (division by 256) you made, I thought was only meant for efficiency purposes, but I noticed it is required for the code to run; otherwise I get an error about values out of range [0,1]. Why is it so?"
I answered this above in the first point, but I would like to explain more. Strictly speaking, normalization isn't done this way (dividing by 256). It is usually done by subtracting the mean and dividing the result by the standard deviation. But for images such as MNIST, the pixels hold values from 0 to 255, so the simplest form of normalization is to divide the input tensor by 255. I divided by 256 instead because of the range [0,1): 1 isn't included, and dividing the highest pixel value in the data (255) by 255 gives exactly 1.
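To make the difference between these scalings concrete, here is a minimal NumPy sketch. The tiny `pixels` array is a hypothetical stand-in for MNIST data, not the real dataset, and the variable names are illustrative only.

```python
import numpy as np

# Hypothetical stand-in for MNIST pixels: uint8 values in 0..255.
pixels = np.array([[0, 128, 255]], dtype=np.uint8)

# Dividing by 255 maps the data onto the closed range [0, 1].
scaled_255 = pixels / 255.0

# Dividing by 256 maps the data onto the half-open range [0, 1).
scaled_256 = pixels / 256.0

# Standardization: subtract the mean, then divide by the standard deviation.
standardized = (pixels - pixels.mean()) / pixels.std()

print(scaled_255.max(), scaled_256.max())
```

The only practical difference between the first two scalings is whether the brightest pixel lands exactly on 1 or just below it.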
x_train = np.expand_dims(x_train, axis=-1)
-- Note that you used a Flatten layer, which Keras provides for flattening inputs such as images. The Flatten layer has an argument describing the image data format, which takes one of two values: channels_first or channels_last. If it is not specified when instantiating the Flatten layer, it defaults to channels_last.
Now, MNIST images are grayscale, so the number of channels is 1: a single channel describing the intensity, 0 for black, 128 for grey and 255 for white. Your input was of shape (28, 28) and contained no channel information. As grayscale images have a single channel and the Flatten layer takes channels_last by default, we modified the input to have shape (28,28,1).
The shape of input images is generally (height, width, no_of_channels) or (no_of_channels, height, width)
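The two layouts can be illustrated with plain NumPy. This is only a sketch: `batch` is a hypothetical array of zeros standing in for a few MNIST images, and the final reshape mimics what a flatten step does to each sample.

```python
import numpy as np

# Hypothetical batch of 5 grayscale images without a channel axis.
batch = np.zeros((5, 28, 28))

# channels_last: (batch, height, width, channels)
channels_last = np.expand_dims(batch, axis=-1)

# channels_first: (batch, channels, height, width)
channels_first = np.expand_dims(batch, axis=1)

# Flattening keeps the batch axis and merges everything else.
flat = channels_last.reshape(len(channels_last), -1)

print(channels_last.shape, channels_first.shape, flat.shape)
```

With a single channel, both layouts flatten to the same 784 values per image; the ordering only starts to matter when there is more than one channel.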
That's all for the explanation.
Now, I saw that you changed the loss function to MSE. Cross-check whether MSE needs the input values in the range [0,1). If not, you can remove the normalization step.
Thank you.. Keep assigning me issues and I will solve them in my free time..
Learn from others' mistakes :)
@ParikhKadam Regarding the discussion about the loss function, I am sorry but you are wrong on that. Denormalizing the values doesn't change anything, as you could already see in the noisy image. There's no way it would have looked like that if the original image was a digit; you would have seen the pattern. And as expected, multiplying by 256.0 doesn't change anything about the noisiness of the image, and no information can be extracted from it. The results with this loss function are simply wrong. And as I said, the training starts at the very beginning with a loss of 0. This right away proves there's a problem and the predictions are wrong.
Thanks for the channel explanation though, I better understand now. And for your overall help :)
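One plausible reading of the zero starting loss, offered as an assumption rather than a confirmed diagnosis: if the loss treats the last axis of the (28,28,1) output as the class axis, there is only one class, so valid labels lie in [0,1) and the softmax over a single value is always 1. The NumPy sketch below imitates that situation; `softmax` and `sparse_cross_entropy` are hypothetical helpers written for illustration, not the Keras implementation.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def sparse_cross_entropy(logits, labels):
    # Pick the probability of the integer label at each position.
    probs = softmax(logits)
    picked = np.take_along_axis(probs, labels[..., None], axis=-1)
    return -np.log(picked).mean()

rng = np.random.default_rng(0)
target = rng.random((2, 28, 28, 1))           # stand-in for images scaled into [0, 1)
prediction = rng.normal(size=(2, 28, 28, 1))  # stand-in for an untrained model's output

# Casting targets below 1 to integers turns every label into class 0.
labels = target[..., 0].astype(np.int64)

ce = sparse_cross_entropy(prediction, labels)  # no signal with a single class
mse = np.mean((prediction - target) ** 2)      # depends on the reconstruction

print(ce, mse)
```

Under these assumptions the cross-entropy carries no information about the reconstruction, whereas the mean squared error does, which is consistent with the switch to MSE described in this thread.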
@Atralb Thank you for notifying.. I understood that we can't train this model on that loss function. I will still try that myself.
For now, is the model running perfectly with the new changes applied? I haven't tried running this code so just asked for confirmation.
Thank you..
Yep with an MSE loss function, it works perfectly and makes a very easy and simple autoencoder for my data ! (I guess this should be marked as solved ?)
@Atralb Ohk.. Yep..mark it as solved and close this issue.
This issue is closed. Thanks @ParikhKadam and @Atralb , this discussion was very clarifying.
Welcome.. @msymp
Hi all,
First of all I must say I'm very new to Tensorflow, Keras and Deep Learning as a whole.
I am trying to create a very simple autoencoder on my own with the Keras Sequential model on the MNIST dataset, as follows:
This is the error message I got and I can't figure out where the error comes from. I imagine it's from the Reshape layer at the end but I'm not completely sure, and even so I don't know why it's wrong. I already tried with
model3.add(Reshape((28,28,1)))
without success. Thanks for your help :)