Closed fchollet closed 6 years ago

We will soon be cutting a final Keras v1 in preparation for upcoming development on Keras v2.
What are some pressing issues that you want to see addressed in Keras v1 before we freeze it? Anything I have been missing? No feature requests, only bugs.
[EDIT:DONE]
Not pressing, I guess, but the (non-)cropping issue (PR #4941) would be nice to either fix, or at the very least fix the documentation/code that defaults to a non-working crop of ((0, 0), (0, 0)).
The issue applies to Cropping2D and Cropping3D; Cropping1D works fine.
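For reference, a minimal sketch of the kind of call affected (Keras 1.x API; the explicit dim_ordering='tf' and the shapes below are my own assumptions, not from the report). Per the report, an explicit non-zero crop like this is the intended usage, while the documented default of ((0, 0), (0, 0)) does not work:

from keras.models import Sequential
from keras.layers import Cropping2D

model = Sequential()
# crop 2 rows from top/bottom and 1 column from left/right of a 28x28x3 input
model.add(Cropping2D(cropping=((2, 2), (1, 1)), dim_ordering='tf', input_shape=(28, 28, 3)))
model.compile(loss='mse', optimizer='adam')
print model.output_shape  # expected: (None, 24, 26, 3)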
[EDIT: DONE]
The only other thing I can think of is the Deconvolution layers, specifically under Theano: 'tf' dim ordering does not work as expected, but I have not drilled into it.
Some example code (I haven't opened an issue or anything yet; trying to understand deconvolution fully first):
from keras.models import Sequential
from keras.layers import ZeroPadding2D, Deconvolution2D
from keras import callbacks
import numpy as np
# This gives an unexpected output shape
model = Sequential()
model.add(Deconvolution2D(3, 3, 3, output_shape=(None, 14, 14, 3), border_mode='valid', dim_ordering='tf', input_shape=(12, 12, 3)))
model.compile(loss='mse', optimizer='adam')
print 'Actual Output Shape: ', model.predict(np.ones((1, 12, 12, 3))).shape
# >> Actual Output Shape: (1, 14, 3, 3)
# This makes it seem like the image was 12x3 with 12 channels, or something
# This works, and gives reasonable output
model = Sequential()
model.add(Deconvolution2D(3, 3, 3, output_shape=(None, 3, 14, 14), border_mode='valid', dim_ordering='th', input_shape=(3, 12, 12)))
model.compile(loss='mse', optimizer='adam')
print 'Actual Output Shape: ', model.predict(np.ones((1, 3, 12, 12))).shape
# >> Actual Output Shape: (1, 3, 14, 14)
# This is as expected, and matches the doc
[EDIT: DONE]
Not sure of your definition of pressing, but --
I think there might be an issue with saving deconv layers with the tf backend.
from keras.models import Sequential, load_model
from keras.layers import Deconvolution2D
model = Sequential()
model.add(Deconvolution2D(3, 3, 3, output_shape=(None, 14, 14, 3), input_shape=(12, 12, 3)))
model.save('tmp.model')
del model
model = load_model('tmp.model')
gives an error for me when I'm using the tf backend.
I added a tuple cast that seems to fix it for me in #4999
(I originally had a second bug here, but it seems to have been fixed!)
I've made two PRs: one to fix a bug in Deconv2D, and one for a general bug I found relating to initializations (mainly with convolutional layers); see #5000 and #5002.
[EDIT: FIXED IN RECENT VERSIONS OF TENSORFLOW]
As mentioned in #4991, there seems to be a consistency issue in the Keras backend. Keras seems to assume that basic operators (e.g. __add__) on tensors have the same broadcasting behavior in both Theano and TensorFlow, but I came across a case earlier where the two backends behave differently: when a is a matrix and b is a vector, the Theano backend broadcasts but TensorFlow does not. Right now, I use the following workaround:
def k_div(x, y):
    try:
        return x / y            # works when the backend broadcasts like Theano
    except:
        import tensorflow as tf
        return tf.div(x, y)     # fall back to TensorFlow's explicit division op
Since I didn't find any operator overloading in the source code, I think it might be better to check for consistency and overload the operators on Keras's own tensor class. Correct me if I misunderstood anything. Thanks.
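To illustrate the suggestion (a purely hypothetical sketch, not actual Keras code), such a wrapper class might route operators through backend-aware helpers like the k_div above:

class KTensor(object):
    """Hypothetical tensor wrapper that delegates operators to backend-aware helpers."""
    def __init__(self, tensor):
        self.tensor = tensor

    def __div__(self, other):            # Python 2 division operator
        other = other.tensor if isinstance(other, KTensor) else other
        return KTensor(k_div(self.tensor, other))

    __truediv__ = __div__                # keep Python 3 behavior consistent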
@iammarvelous The code below works on both Theano 0.8.2 and TensorFlow r0.11 under Python 2.7. Both Theano and TensorFlow (AFAIK) follow the SciPy broadcasting guidelines.
import keras.backend as K
import numpy as np
def division_ex(mat, vec):
    return mat / vec

mat = K.placeholder(shape=(28, 28))
vec = K.placeholder(shape=(28,))
division_fn = division_ex(mat, vec)
fn = K.function(inputs=[mat, vec], outputs=[division_fn])

# 8.0 / 4.0 == 2.0 ?
print K.backend(), np.allclose(fn([np.ones((28, 28)) * 8., np.ones((28,)) * 4.]), np.ones((28, 28)) * 2.0)
# >> theano True
# <switch backends>
# >> tensorflow True
[EDIT: FIXED A LONG TIME AGO]
As mentioned in #4954, the TimeDistributed wrapper does not work with recurrent layers in version 1.2.0. The same code worked on a previous version of Keras (1.1.2).
I get the following error on Keras 1.2.0:
File "/home/##/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 569, in __call__
self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
File "/home/##/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 632, in add_inbound_node
Node.create_node(self, inbound_layers, node_indices, tensor_indices)
File "/home/##/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 164, in create_node
output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))
File "/home/##/.local/lib/python2.7/site-packages/keras/layers/wrappers.py", line 129, in call
y = self.layer.call(X) # (nb_samples * timesteps, ...)
File "/home/##/.local/lib/python2.7/site-packages/keras/layers/recurrent.py", line 201, in call
input_shape = K.int_shape(x)
File "/home/##/.local/lib/python2.7/site-packages/keras/backend/theano_backend.py", line 128, in int_shape
raise Exception('Not a Keras tensor:', x)
Exception: ('Not a Keras tensor:', Reshape{3}.0)
The code snippet is below:
input_layer = Input(shape=(document_size, img_h,), dtype='int32', name='input_layer')
embedding_layer = TimeDistributed(Embedding(len(W2V), img_w, input_length=img_h, weights=[W2V], trainable=True, mask_zero=True))(input_layer)
gru_word = TimeDistributed(GRU(GRU_layers[0], return_sequences=True, activation=conv_non_linear))(embedding_layer)
Can we now have many BN layers with mode=0?
@keunwoochoi What do you mean many BN layers? I think you can now have as many shared/unshared/whatever mode=0 BN layers as you want.
BatchNorm layers - unless that's not what you meant, haha.
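For instance, a minimal sketch (Keras 1.x API; the layer sizes are arbitrary) of a model stacking several mode=0 BatchNormalization layers:

from keras.models import Sequential
from keras.layers import Dense
from keras.layers.normalization import BatchNormalization

model = Sequential()
model.add(Dense(64, input_dim=100))
model.add(BatchNormalization(mode=0))   # feature-wise normalization
model.add(Dense(64))
model.add(BatchNormalization(mode=0))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')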
Related issues are #3903 #4334 #2954.
@keunwoochoi not sure the specifics of what you mean, but the issues you are referring to have been fixed a long time ago.
@fchollet I see, thanks.
@patyork @iammarvelous
For the division example, TensorFlow will fail if you change the mat/vec shapes from [28, 28] / [28] to [7, 28] / [7]: TensorFlow raises an exception while Theano runs fine.
import os
# os.environ['KERAS_BACKEND'] = 'theano'
import keras.backend as K
import numpy as np
def division_ex(mat, vec):
    return mat / vec

mat = K.placeholder(shape=(7, 28))
vec = K.placeholder(shape=(7,))
division_fn = division_ex(mat, vec)
fn = K.function(inputs=[mat, vec], outputs=[division_fn])

# 8.0 / 4.0 == 2.0 ?  (feed data with the same (7, 28) / (7,) shapes as the placeholders)
print K.backend(), np.allclose(fn([np.ones((7, 28)) * 8., np.ones((7,)) * 4.]), np.ones((7, 28)) * 2.0)
Theano 0.8.2, tensorflow 0.12.0-rc-1
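As an aside (not from the thread), a plain NumPy check of the same shapes shows why broadcasting is the sticking point: trailing dimensions must match, so (7, 28) / (7,) is rejected, while reshaping the vector to (7, 1) divides row-wise:

import numpy as np

mat = np.ones((7, 28)) * 8.
vec = np.ones((7,)) * 4.
# mat / vec raises "operands could not be broadcast together"
print (mat / vec[:, None]).shape  # (7, 28), every entry 2.0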
The following Keras model runs smoothly under TensorFlow, but gives "theano.gof.fg.MissingInputError" on the last line under Theano (X and y can be any dummy data):
from keras.models import Sequential
from keras.layers import TimeDistributed, Dropout, Dense, LSTM
from keras.optimizers import RMSprop
import numpy as np

X = np.random.rand(100000, 320)
y = np.random.rand(100000, 1)

model = Sequential()
model.add(TimeDistributed(Dropout(0.5), batch_input_shape=(1, 100000, 320)))
model.add(TimeDistributed(Dense(1000)))
model.add(TimeDistributed(Dropout(0.5)))
model.add(LSTM(1000, batch_input_shape=(1, 100000, 1000),
               return_sequences=True, stateful=True))
model.add(TimeDistributed(Dropout(0.5)))
model.add(TimeDistributed(Dense(1)))

opti = RMSprop(lr=0.001)
model.compile(optimizer=opti, loss='mse')
model.train_on_batch(X[np.newaxis, 0:100000, :], y[np.newaxis, 0:100000])
The average pooling layer doesn't support masking, so the zero padding throws off the calculation of the average. I think you just need to set the flag.
(This is out of thread scope and probably discussed elsewhere, but I think the sequence handling in Keras should probably be redesigned. The current approach loses information: after padding you don't know the true sequence boundaries, so you have to guess from the pad value. This is problematic as soon as you embed. I've found it quite difficult to implement pooling and attention models correctly. I think a design that maintains an array of the original sequence lengths might work better.)
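As a rough illustration of that idea (a plain NumPy sketch, not a Keras API), keeping the true sequence lengths alongside the zero-padded batch makes a correct average pool straightforward:

import numpy as np

def masked_average(x, lengths):
    # x: (batch, timesteps, features), zero-padded; lengths: (batch,) true lengths
    steps = np.arange(x.shape[1])
    mask = steps[None, :] < lengths[:, None]              # (batch, timesteps) boolean mask
    summed = (x * mask[:, :, None]).sum(axis=1)           # ignore padded timesteps
    return summed / lengths[:, None].astype(float)        # divide by the true length

x = np.zeros((2, 4, 3))
x[0, :2] = 1.0   # first sequence has 2 real steps
x[1, :3] = 2.0   # second sequence has 3 real steps
print masked_average(x, np.array([2, 3]))  # [[1. 1. 1.], [2. 2. 2.]]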
Do the Deconvolution2D layers still require output_shape? That's a common source of issues for me and is hard to work with, especially because you have to explicitly pass the batch size into output_shape instead of using -1 (as in NumPy or TensorFlow), or just not requiring the batch size at all.
Example:
Required:
deconv1 = Deconvolution2D(10, 5, 5, output_shape=[batch_size] + conv1.get_shape().as_list()[1:])(conv1)
Instead of:
deconv1 = Deconvolution2D(10, 5, 5, output_shape=[-1] + conv1.get_shape().as_list()[1:])(conv1)
or not having that at all. Definitely not a bug, but a common source of user issues.
Referring to my previous comment, I see you can use None instead of -1 or batch_size, which is great. However, there is a common bug in the Deconvolution2D class when you use None, which originates from tf.nn.deconv2d but can easily be alleviated in Keras: "TypeError: Expected binary or unicode string, got None".
The fix is to add tf.stack() in tensorflow_backend.py on line 2472:
def _preprocess_deconv_output_shape(x, shape, dim_ordering):
    if dim_ordering == 'th':
        shape = (shape[0], shape[2], shape[3], shape[1])

    if shape[0] is None:
        shape = (tf.shape(x)[0],) + tuple(shape[1:])
        shape = tf.stack(list(shape))
    return shape
Example issue from TF that carries over to keras: https://groups.google.com/a/tensorflow.org/forum/#!topic/discuss/vf8eH9YMwVA
Hi all,
I don't know if this thread is still open, but I have been experiencing some issues with bidirectional LSTMs and the functional API, and this seems like the right place to get feedback.
I don't know if I got this right, but I think that:
bdlstm=Sequential()
bdlstm.add(Bidirectional(LSTM(32, return_sequences=True), input_shape=(None, 128)))
layer_3_merge_x = bdlstm(layer_1)
and
layer_2_lstm_forward = LSTM(32, return_sequences=True)(layer_1)
layer_2_lstm_backward = LSTM(32, return_sequences=True, go_backwards=True)(layer_1)
layer_3_merge_x = concatenate([layer_2_lstm_forward, layer_2_lstm_backward])
should behave identically. I tried multiple ways of merging the two layers (concatenating/summing etc.), but still no luck.
However, this only seems to happen if the number of timesteps is fixed for the input data. Whenever I used non-fixed inputs it seemed like the second layer (go_backwards=True) was not working properly.
Hope I got this right, and many thanks to the developers of Keras (really great framework).
Best, Tibi
@tiberiu44 please open a new issue for this if necessary.
Closing this issue as keras 2.0 has been released quite some time ago.