Closed: aironashish closed this issue 8 years ago.
You defined your batch size as 64, but at some point only 8 samples are passed as input. Please check where that might be happening by accident. Note that if you are using the Theano backend, you can use an "undefined" batch size and avoid the problem altogether.
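Something along these lines, for instance (a rough sketch, not your code; the sampling layer would then also need to stop hard-coding batch_size, see further down in the thread):

x = Input(shape=(original_dim,))  # shape instead of batch_shape: the batch dimension stays symbolic
h = Dense(intermediate_dim, activation='relu')(x)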
I checked; there is no way 8 samples are being passed as input. You can see the code below:
from keras.layers import Input, Dense, Lambda
from keras.models import Model
from keras import backend as K
from keras import objectives

batch_size = 8
original_dim = 678
latent_dim = 2
intermediate_dim = 45
nb_epoch = 5

# encoder
x = Input(batch_shape=(batch_size, original_dim))
h = Dense(intermediate_dim, activation='relu')(x)
z_mean = Dense(latent_dim)(h)
z_log_var = Dense(latent_dim)(h)

# reparameterization trick: z = mean + exp(log_var / 2) * epsilon
def sampling(args):
    z_mean, z_log_var = args
    epsilon = K.random_normal(shape=(batch_size, latent_dim), mean=0.)
    return z_mean + K.exp(z_log_var / 2) * epsilon

z = Lambda(sampling, output_shape=(latent_dim,))([z_mean, z_log_var])

# decoder
decoder_h = Dense(intermediate_dim, activation='relu')
decoder_mean = Dense(original_dim, activation='softmax')
h_decoded = decoder_h(z)
x_decoded_mean = decoder_mean(h_decoded)

# custom VAE loss (defined here but not passed to compile below)
def vae_loss(x, x_decoded_mean):
    xent_loss = objectives.categorical_crossentropy(x, x_decoded_mean)
    kl_loss = -0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
    return xent_loss + kl_loss

vae = Model(input=x, output=x_decoded_mean)
vae.compile(optimizer='adam', loss="categorical_crossentropy")
vae.fit(x_train, x_train,
        shuffle=False,
        nb_epoch=nb_epoch,
        batch_size=batch_size,
        validation_data=(x_test, x_test))
The other problem is that if I set the batch size to 8, it gives the following error:
Input dimension mis-match. (input[1].shape[0] = 4, input[3].shape[0] = 8)
Apply node that caused the error: Elemwise{Composite{(exp((i0 * (i1 + i2))) * i3)}}[(0, 1)](TensorConstant{(1, 1) of 0.5}, Dot22.0, InplaceDimShuffle{x,0}.0, Reshape{2}.0)
Toposort index: 20
Inputs types: [TensorType(float32, (True, True)), TensorType(float32, matrix), TensorType(float32, row), TensorType(float32, matrix)]
Inputs shapes: [(1, 1), (4, 2), (1, 2), (8, 2)]
Inputs strides: [(4, 4), (8, 4), (8, 4), (8, 4)]
Inputs values: [array([[ 0.5]], dtype=float32), 'not shown', array([[-0.13515097, 0.00348977]], dtype=float32), 'not shown']
Outputs clients: [[Gemm{inplace}(Elemwise{Composite{(exp((i0 * (i1 + i2))) * i3)}}[(0, 1)].0, TensorConstant{1.0}, Elemwise{Composite{(i0 * (Abs((i1 + i2)) + i1 + i2))}}[(0, 1)].0, dense_12_W, TensorConstant{1.0})]]
HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
How many samples do you have in your dataset? Is it a multiple of 8? This line is what worries me: epsilon = K.random_normal(shape=(batch_size, latent_dim), mean=0.)
But I think Keras forces the same number of samples in every batch during training, so that should not be happening at training time.
What is the size of the test set?
Train dataset shape: (22664, 678). Test dataset shape: (1972, 678).
If I run on a GPU, this is the error:
ValueError: GpuElemwise. Input dimension mis-match. Input 3 (indices start at 0) has shape[0] == 8, but the output's size on that axis is 4.
Apply node that caused the error: GpuElemwise{Composite{(exp((i0 * (i1 + i2))) * i3)}}[(0, 1)](CudaNdarrayConstant{[[ 0.5]]}, GpuDot22.0, GpuDimShuffle{x,0}.0, GpuReshape{2}.0)
Toposort index: 25
Inputs types: [CudaNdarrayType(float32, (True, True)), CudaNdarrayType(float32, matrix), CudaNdarrayType(float32, row), CudaNdarrayType(float32, matrix)]
Inputs shapes: [(1, 1), (4, 2), (1, 2), (8, 2)]
Inputs strides: [(0, 0), (2, 1), (0, 1), (2, 1)]
Inputs values: [CudaNdarray([[ 0.5]]), 'not shown', CudaNdarray([[-0.08925677 -0.15396592]]), 'not shown']
Outputs clients: [[GpuGemm{inplace}(GpuElemwise{Composite{(exp((i0 * (i1 + i2))) * i3)}}[(0, 1)].0, TensorConstant{1.0}, GpuElemwise{Composite{(i0 * ((i1 + i2) + Abs((i1 + i2))))}}[(0, 1)].0, dense_30_W, TensorConstant{1.0})]]
HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
Just to test my hypothesis, could you please run the experiment with batch_size=4? 1972 is not divisible by 8 (1972 = 8 * 246 + 4, so the last validation batch would contain only 4 samples, which matches the (4, 2) vs (8, 2) shapes in the traceback), but it is divisible by 4.
It works with batch size = 4. So divisibility is the only problem? But I never had such a problem in Keras before!
It's because of your epsilon = K.random_normal(shape=(batch_size, latent_dim), mean=0.) in the sampling layer: epsilon ALWAYS has batch_size rows. A better approach would behave differently at test time, where no sampling is done and only the mean is forwarded. Using Lambda is a quick hack, but it forces you to keep batch_size well defined.
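A rough sketch of that idea (illustrative only; it assumes the backend accepts a symbolic shape in K.random_normal and uses K.in_train_phase to switch between training and test behavior):

def sampling(args):
    z_mean, z_log_var = args
    # take the batch dimension from z_mean instead of hard-coding batch_size
    epsilon = K.random_normal(shape=K.shape(z_mean), mean=0.)
    z_sampled = z_mean + K.exp(z_log_var / 2) * epsilon
    # at test time, skip sampling and forward only the mean
    return K.in_train_phase(z_sampled, z_mean)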
Thanks a lot!
Hi,
I am using the variational autoencoder described in http://blog.keras.io/building-autoencoders-in-keras.html.
My input shape is (22664, 678) and I have changed the loss to "categorical_crossentropy". I am not using the customized loss function (I thought that might be causing the problem). The parameter values are as follows:
batch_size = 64
original_dim = 678
latent_dim = 2
intermediate_dim = 45
nb_epoch = 5
My input contains values ranging from 0.5 to 21 (most of them are 0s).
The program gives the following error:
Input dimension mis-match. (input[0].shape[0] = 8, input[1].shape[0] = 64)
Apply node that caused the error: Elemwise{mul,no_inplace}(Elemwise{Composite{exp((i0 * (i1 + i2)))}}[(0, 1)].0, Reshape{2}.0)
Toposort index: 47
Inputs types: [TensorType(float32, matrix), TensorType(float32, matrix)]
Inputs shapes: [(8, 2), (64, 2)]
Inputs strides: [(8, 4), (8, 4)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[Gemm{inplace}(Elemwise{mul,no_inplace}.0, TensorConstant{1.0}, Elemwise{Composite{(i0 * (Abs(i1) + i2 + i3))}}[(0, 2)].0, dense_47_W, TensorConstant{1.0})]]
HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
Can someone please explain the reasons behind this?