keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

CNN-LSTM with video frame sequence: InvalidArgumentError: You must feed a value for placeholder tensor #5934

Closed Feynman27 closed 3 years ago

Feynman27 commented 7 years ago

I'm building a CNN-LSTM network in Keras (v2.02) + Tensorflow (v1.0.1) using video frames as input. I'm setting up the network as shown below:

import numpy as np
import tensorflow as tf
import keras
import cv2

video = keras.layers.Input(shape=(None, 299,299,3),name='video_input')

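# Note: include_top is passed as the string 'False', which is truthy, so the
# classification head is kept (hence the 'predictions/Softmax' output below).
cnn = keras.applications.InceptionV3(weights='imagenet',
                                 include_top='False',
                                 pooling='avg')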

cnn.trainable = False
encoded_frame = keras.layers.TimeDistributed(cnn)(video)
encoded_vid = keras.layers.LSTM(256)(encoded_frame)
outputs = keras.layers.Dense(128, activation='relu')(encoded_vid)

Some of the tensor properties are below:

video
<tf.Tensor 'video_input:0' shape=(?, ?, 299, 299, 3) dtype=float32>

cnn.input
<tf.Tensor 'input_1:0' shape=(?, 299, 299, 3) dtype=float32>

cnn.output
<tf.Tensor 'predictions/Softmax:0' shape=(?, 1000) dtype=float32>    

encoded_frame
<tf.Tensor 'time_distributed_1/Reshape_1:0' shape=(?, ?, 1000) dtype=float32>

encoded_vid
<tf.Tensor 'lstm_1/TensorArrayReadV3:0' shape=(?, 256) dtype=float32>

outputs
<tf.Tensor 'dense_1/Relu:0' shape=(?, 128) dtype=float32>

Now I build the model and fit the data:

model = keras.models.Model(inputs=[video],outputs=outputs)
model.compile(optimizer='adam',
          loss='mean_squared_logarithmic_error')
# Generate random targets
y = np.random.random(size=(128,)) 
y = np.reshape(y,(-1,128))
model.fit(x=frame_sequence, y=y, validation_split=0.0,shuffle=False, batch_size=1)

where frame_sequence is a sequence of video frames from one video:

frame_sequence.shape
(1, 48, 299, 299, 3)
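
To make the shapes above concrete, here is a minimal sketch of the shape bookkeeping a TimeDistributed wrapper performs, assuming the reshape-based path hinted at by the `(num_samples * timesteps, ...)` comment in the Keras wrappers.py frame of a traceback further down this thread. The helper `time_distributed_shapes` is hypothetical and only manipulates shape tuples; it is not a Keras API.

```python
def time_distributed_shapes(input_shape, inner_output_shape):
    """Return the shapes seen before, inside, and after a TimeDistributed wrapper.

    input_shape:        (batch, timesteps, ...) of the wrapper's input
    inner_output_shape: per-sample output shape of the wrapped model (no batch axis)
    """
    batch, timesteps = input_shape[0], input_shape[1]
    # Fold time into the batch axis so the wrapped model sees ordinary samples.
    folded = (batch * timesteps,) + tuple(input_shape[2:])
    # The wrapped model maps each folded sample to inner_output_shape.
    inner_out = (batch * timesteps,) + tuple(inner_output_shape)
    # Unfold back to (batch, timesteps, ...).
    unfolded = (batch, timesteps) + tuple(inner_output_shape)
    return folded, inner_out, unfolded


# One video of 48 frames; the wrapped CNN emits a 1000-way softmax per frame.
folded, inner_out, unfolded = time_distributed_shapes((1, 48, 299, 299, 3), (1000,))
```

The final shape `(1, 48, 1000)` corresponds to the `(?, ?, 1000)` reported for encoded_frame, and it is this per-video sequence that the LSTM consumes.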

All seems well up to the training step model.fit, where I get an error attributed to the input_1 placeholder in the InceptionV3 model input:

InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'input_1' with dtype float
 [[Node: input_1 = Placeholder[dtype=DT_FLOAT, shape=[], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

Training works without error if I build my CNN from scratch instead of loading InceptionV3. For example, replacing InceptionV3 with:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten

cnn = Sequential()
cnn.add(Conv2D(64, (3, 3), activation='relu', padding='same', input_shape=(299, 299, 3)))
cnn.add(Conv2D(64, (3, 3), activation='relu'))
cnn.add(MaxPooling2D((2, 2)))
cnn.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
cnn.add(Conv2D(128, (3, 3), activation='relu'))
cnn.add(MaxPooling2D((2, 2)))
cnn.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
cnn.add(Conv2D(256, (3, 3), activation='relu'))
cnn.add(Conv2D(256, (3, 3), activation='relu'))
cnn.add(MaxPooling2D((2, 2)))
cnn.add(Flatten())

Here is some minimal code to reproduce the issue.

matt-gardner commented 7 years ago

I'm seeing the same error, though in a very different context. I have an implementation of Bidirectional Attention Flow (BiDAF) in Keras. I want to load that model, similar to how @Feynman27 is loading InceptionV3, then pull parts out of it for use in a model on a different, but similar task (see code here). This worked in Keras 1, but is breaking when porting our code to Keras 2. You can see a trace of the failure here. I spent several hours trying to figure out what is going on here, and I'm at a wall. I thought that the __call__ method might not be hooking up the inputs correctly, because there's an extra placeholder still lying around, but I made a minimal example trying to show that something is broken, and it actually works. I'm at a total loss for why my example doesn't have the same crash that my real model has.

I very much want this bug to be fixed - I'm happy to help debug, if anyone has suggestions on what to do. I've run out of ideas.

MuOtb commented 7 years ago

Why do you define the layers as keras.layers.Input, etc.?

Why don't you use this style:

from keras.models import Model
from keras.layers import Input, Dense

a = Input(shape=(32,))
b = Dense(32)(a)
model = Model(inputs=a, outputs=b)

Could this be the error?

I am trying to do similar model:

input ( video frames ) > Conv > tLSTM

the conv network alone works, but LSTM is not working.

nelson-liu commented 7 years ago

@MuOtb they're functionally identical, some people just don't like having multiple imports at the top of their file.

Feynman27 commented 7 years ago

One of my colleagues referred me to this issue encountered when using BN in a sub-model applied to a time-distributed layer. Looks like a similar issue, but at the moment it's still unclear to me how to apply that recommendation to the case above with Inception.

alfiya400 commented 7 years ago

@Feynman27 Potential workaround: set learning phase to 0 and pass Lambda(lambda x: cnn(x)) to TimeDistributed. Here's the full example: https://gist.github.com/alfiya400/9d3bf303966f87a3c2aa92a0a0a54662

I also checked that output from TimeDistributed and cnn.predict match each other.

The drawback of this approach: you can't use Dropouts and there might be other restrictions when learning_phase is set to 0.

matt-gardner commented 7 years ago

@alfiya400, do you have any idea why that workaround works for the CNN? I tried those in my model, and it does not solve the issue.

buptss commented 7 years ago

@alfiya400 How do you try this out? I'm going to try your method, and I hope it works out for me.

Feynman27 commented 7 years ago

@buptss Just follow her gist link above. It worked for me but not @matt-gardner. We're trying to figure out why, but my suspicion is that it has something to do with the batch normalization, and the Lambda function is instantiating a new BN instance for each CNN output.

alfiya400 commented 7 years ago

@matt-gardner I think there are two problems here (in the TimeDistributed-over-InceptionV3 problem):

This is happening because the uses_learning_phase attribute is not the same for cnn and model:

model.uses_learning_phase=False
cnn.uses_learning_phase=True  # (because of the BatchNorm layer)

When you call model.fit, it first builds the list of inputs using this code and fails to add K.learning_phase to that list because model.uses_learning_phase=False. So, to fix that, you could set learning_phase to 0 or use Dropout/BatchNorm in all your models (using Dropout(0.0001) could probably be a workaround).
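
The mechanism described above can be illustrated with a toy model in plain Python. This is only a sketch under stated assumptions: `ToyGraph` and `build_feed` are hypothetical stand-ins for a TF1-style graph and for the feed-list construction in model.fit; they are not Keras or TensorFlow code.

```python
class ToyGraph:
    """Toy stand-in for a TF1-style graph: every placeholder must be fed."""

    def __init__(self, placeholders):
        self.placeholders = set(placeholders)

    def run(self, feed):
        missing = sorted(self.placeholders - set(feed))
        if missing:
            raise ValueError(
                "You must feed a value for placeholder tensor %r" % missing[0])
        return "ok"


def build_feed(inputs, uses_learning_phase, phase=1):
    """Mimic the described logic: add the learning phase only when flagged."""
    feed = dict(inputs)
    if uses_learning_phase:
        feed["keras_learning_phase"] = phase
    return feed


# The inner CNN (BatchNorm) put a learning-phase placeholder into the graph...
graph = ToyGraph(["video_input", "keras_learning_phase"])
inputs = {"video_input": "frames"}

# ...but the outer model reports uses_learning_phase=False, so it is never fed.
try:
    graph.run(build_feed(inputs, uses_learning_phase=False))
    outcome = "ok"
except ValueError as err:
    outcome = str(err)

# With the flag set, the placeholder is fed and the run succeeds.
fixed = graph.run(build_feed(inputs, uses_learning_phase=True))
```

In this toy, the failure depends only on the mismatch between what the graph requires and what the outer model believes it must feed, which is the situation described for cnn and model above.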

matt-gardner commented 7 years ago

Yeah, I tried wrapping the TimeDistributed part of my model in a Lambda after your first comment, but it didn't work. I just now tried also wrapping the other Model that I use in a Lambda, and that didn't work, either. I still get the missing placeholder error (on an input tensor, not a batch norm tensor, so it's the second issue you mention, not the first). If I knew why adding the Lambda helps in your case, maybe I could figure out what's going wrong in my case, because I'm pretty sure they're related...

Wenbo93 commented 7 years ago

I met the same error when applying TimeDistributed to InceptionV3. I also think it is due to the compatibility of TimeDistributed and BatchNormalization, because I didn't see this error when using TimeDistributed to wrap VGG16, which does not have a BN layer. @alfiya400 Thanks for your solution! It works, at least for now, in my project.

QuantumLiu commented 7 years ago

I got the same problem. I agree with @Wenbo93. I think it is due to the compatibility of TimeDistributed and BatchNormalization. This is my code. I used BN in a TimeDistributed CNN.

    convs= Sequential()
    convs.add(Convolution2D(int(hidden_size/(2**(nb_conv_layers-1))), 1, 1,input_shape=shape[1:], border_mode="same", bias=False,activation='relu'))
    convs.add(Convolution2D(int(hidden_size/(2**(nb_conv_layers-1))), 3, 3, border_mode="same", bias=False,activation='relu'))
    convs.add(BatchNormalization(axis=3))
    convs.add(MaxPooling2D((2,2),border_mode='same'))
    for l_cnn in range(1,nb_conv_layers):
        convs.add(Convolution2D(int(hidden_size/(2**(nb_conv_layers-l_cnn-1))), 1, 1, border_mode="same", bias=False,activation='relu'))
        convs.add(Convolution2D(int(hidden_size/(2**(nb_conv_layers-l_cnn-1))), 3, 3, border_mode="same", bias=False,activation='relu'))
        convs.add(Convolution2D(int(hidden_size/(2**(nb_conv_layers-l_cnn-1))), 3, 3, border_mode="same", bias=False,activation='relu'))
        convs.add(BatchNormalization(axis=3))
        convs.add(MaxPooling2D((2,2),border_mode='same'))
    convs.add(Flatten())
    # Wrap the CNN and connect it to an RNN
    out=TimeDistributed(convs)(inputs)
    for l_rnn in range(nb_rnn_layers-1):
        out=LSTM(512,return_sequences=True,activation='relu',stateful=stateful)(out)
    out=LSTM(512,return_sequences=False,activation='relu',stateful=stateful)(out)
    out=Dropout(0.2)(out)
    out=Dense(1024,activation='relu')(out)
    out=Dropout(0.2)(out)
    out=Dense(1,activation=activation)(out)
    tdcnn=Model(input=[inputs],output=[out])

StefPac commented 7 years ago

I have a small example that reproduces the problem.

nb_samples = 50
input_a_len = 50
X = np.ones((nb_samples, 2, input_a_len), dtype=np.float32)
Y = np.ones((nb_samples, 2, 1), dtype=np.float32)
input_a = Input(shape=(2, input_a_len), name='input_a', dtype='float32')
input_a_reshaped = Reshape((2, input_a_len, 1))(input_a)
pred = TimeDistributed(LSTM(1, recurrent_dropout=0.1))(input_a_reshaped)
model = Model([input_a], pred)
model.compile(loss='binary_crossentropy', optimizer='sgd')
hist = model.fit(x=X, y=Y)

This produces the error:

InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'time_distributed_1/keras_learning_phase' with dtype bool
         [[Node: time_distributed_1/keras_learning_phase = Placeholder[dtype=DT_BOOL, shape=[], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

In this case, any of the following solves the problem:

  1. adding K.set_learning_phase(1) explicitly,
  2. removing the TimeDistributed (with dimensional adjustments), or
  3. removing the recurrent_dropout option.

However, none of these workarounds seems an acceptable solution.

avn3r commented 7 years ago

@StefPac Yeah, I have the same error. Adding set_learning_phase(1) solves it. What issues or consequences could arise from hard-coding this value when training a model (with validation)?

gewoonrik commented 7 years ago

@abnera in that case Dropout will also drop out neurons during validation, for example.
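
A minimal sketch of that effect, using inverted dropout written in plain Python rather than the Keras layer. The `dropout` helper and its `training` flag are hypothetical stand-ins for Dropout and the learning phase.

```python
import random


def dropout(values, rate, training, rng):
    """Inverted dropout: drop units with probability `rate` while training,
    scale the survivors by 1 / (1 - rate), and pass through unchanged otherwise."""
    if not training:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


rng = random.Random(0)
activations = [1.0, 2.0, 3.0, 4.0]

# Learning phase 0 (what validation should use): activations pass through.
validation_ok = dropout(activations, rate=0.5, training=False, rng=rng)

# Learning phase hard-coded to 1: validation activations are randomly zeroed
# or rescaled, so validation metrics become noisy.
validation_forced = dropout(activations, rate=0.5, training=True, rng=rng)
```

With the phase forced to 1, every validation pass sees a different random mask, so the same input no longer produces the same output.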

avn3r commented 7 years ago

@gewoonrik Thanks for the explanation. Yeah, I am getting very poor validation results by hard-coding the learning_phase: set_learning_phase(1).

redsphinx commented 7 years ago

@QuantumLiu did you manage to find a workaround for the problem with the batchnorm layer? I am also encountering problems when trying to build TimeDistributed(BatchNormalization())(input), which gives

InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'time_distributed_1/keras_learning_phase' with dtype bool
     [[Node: time_distributed_1/keras_learning_phase = Placeholder[dtype=DT_BOOL, shape=<unknown>, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
     [[Node: Mean_3/_33 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_1296_Mean_3", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

tRosenflanz commented 7 years ago

@gewoonrik I've circumvented this behavior by recreating the model without the dropout layers and reloading the weights into it - it loads and the predictions are stable indicating that dropout is not applied. Batch norm layer is not so easy though - it has weights so if you drop the layers the weights won't be loaded due to layer mismatch between "training model" and "predictive" one. I am thinking this could be circumvented by creating another layer that follows the same structure as Batch Norm but returns the same value when in_train_phase is called.
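
The difference between dropping Dropout and dropping Batch Norm can be sketched with a toy loader. It assumes order-based loading that skips layers without weights, which is an assumption about how Keras's default (non-by_name) load_weights behaves; `load_weights_in_order` and the layer tuples are hypothetical.

```python
def load_weights_in_order(saved, layers):
    """Assign saved weights to the target layers that own weights, in order.

    saved:  list of weight blobs, one per weighted layer of the training model
    layers: list of (name, number_of_weight_tensors) for the target model
    """
    weighted = [name for name, n_weights in layers if n_weights > 0]
    if len(saved) != len(weighted):
        raise ValueError("saved file has %d weighted layers, target model has %d"
                         % (len(saved), len(weighted)))
    return dict(zip(weighted, saved))


# Training model: conv -> dropout -> batch norm -> dense.
# Dropout owns no weights, so only three blobs are saved.
saved = ["conv_weights", "bn_weights", "dense_weights"]

# Predictive model without Dropout: the weighted layers still line up.
no_dropout = [("conv", 2), ("bn", 4), ("dense", 2)]
loaded = load_weights_in_order(saved, no_dropout)

# Predictive model without Batch Norm: one weighted layer is missing.
no_bn = [("conv", 2), ("dense", 2)]
try:
    load_weights_in_order(saved, no_bn)
    bn_result = "loaded"
except ValueError as err:
    bn_result = str(err)
```

Under this assumption, a replacement layer that keeps the same weights as Batch Norm but ignores the learning phase would keep the layer count aligned, which is the idea proposed above.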

elternativeht commented 7 years ago

I met the same issue in a different context. My model is meant to learn adaptively from Chinese character vectors to word vectors, and further to word properties.

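import numpy as np
from keras.models import Model
from keras.layers import Input, Masking, TimeDistributed, Bidirectional, LSTM, Dense

def HBLSTM4POS(maxword_per_sen=20,maxchar_per_word=8,word_vec_dim=52,pos_num=26):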
    InputLayers = Input(shape=(maxword_per_sen,maxchar_per_word,word_vec_dim),name='InputTensor')
    Posmask = TimeDistributed(Masking(mask_value=0.0,input_shape=(8,52)),input_shape=(20,8,52))(InputLayers)
    WordLayer = TimeDistributed(Bidirectional(LSTM(52,return_sequences=False,dropout=0.1,input_shape=(8,52),name='WordVector')),input_shape=(20,8,52))(Posmask)
    POS_LSTM1 = Bidirectional(LSTM(52,return_sequences=True))(WordLayer)
    POS_LSTM2 = Bidirectional(LSTM(52,return_sequences=True))(POS_LSTM1)
    Dense1 = TimeDistributed(Dense(pos_num*3,activation='relu'))(POS_LSTM2)
    Dense2 = TimeDistributed(Dense(pos_num,activation='softmax',name='POS_Output'))(Dense1)
    model = Model(inputs=InputLayers, outputs=Dense2)
    model.compile(optimizer='adam', loss='binary_crossentropy',metrics=['accuracy'])
    return model

train_data = np.random.rand(100,20,8,52)
train_y = np.random.randint(26,size=(100,20,26))
model = HBLSTM4POS()
model.fit(train_data,train_y,batch_size=10,epochs=2,validation_split=0.1)

Full error message:

2017-09-10 20:42:29.012323: W tensorflow/core/framework/op_kernel.cc:1158] Invalid argument: You must feed a value for placeholder tensor 'time_distributed_2/keras_learning_phase' with dtype bool
     [[Node: time_distributed_2/keras_learning_phase = Placeholder[dtype=DT_BOOL, shape=<unknown>, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
Traceback (most recent call last):
  File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1139, in _do_call
    return fn(*args)
  File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1121, in _run_fn
    status, run_metadata)
  File "/usr/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'time_distributed_2/keras_learning_phase' with dtype bool
     [[Node: time_distributed_2/keras_learning_phase = Placeholder[dtype=DT_BOOL, shape=<unknown>, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
     [[Node: mul_1/_41 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_12859_mul_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "integrated_model.py", line 36, in <module>
    model.fit(train_data,train_y,batch_size=10,epochs=2,validation_split=0.1)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 1507, in fit
    initial_epoch=initial_epoch)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 1156, in _fit_loop
    outs = f(ins_batch)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 2269, in __call__
    **self.session_kwargs)
  File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 789, in run
    run_metadata_ptr)
  File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 997, in _run
    feed_dict_string, options, run_metadata)
  File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run
    target_list, options, run_metadata)
  File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'time_distributed_2/keras_learning_phase' with dtype bool
     [[Node: time_distributed_2/keras_learning_phase = Placeholder[dtype=DT_BOOL, shape=<unknown>, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
     [[Node: mul_1/_41 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_12859_mul_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

Caused by op 'time_distributed_2/keras_learning_phase', defined at:
  File "integrated_model.py", line 35, in <module>
    model = HBLSTM4POS()
  File "integrated_model.py", line 15, in HBLSTM4POS
    WordLayer = TimeDistributed(Bidirectional(LSTM(52,return_sequences=False,dropout=0.1,input_shape=(8,52),name='WordVector')),input_shape=(20,8,52))(Posmask)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/topology.py", line 596, in __call__
    output = self.call(inputs, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/keras/layers/wrappers.py", line 177, in call
    y = self.layer.call(inputs)  # (num_samples * timesteps, ...)
  File "/usr/local/lib/python3.5/dist-packages/keras/layers/wrappers.py", line 263, in call
    y = self.forward_layer.call(inputs, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/keras/layers/recurrent.py", line 333, in call
    preprocessed_input = self.preprocess_input(inputs, training=None)
  File "/usr/local/lib/python3.5/dist-packages/keras/layers/recurrent.py", line 1077, in preprocess_input
    timesteps, training=training)
  File "/usr/local/lib/python3.5/dist-packages/keras/layers/recurrent.py", line 46, in _time_distributed_dense
    x = K.in_train_phase(x * expanded_dropout_matrix, x, training=training)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 2602, in in_train_phase
    training = learning_phase()
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 115, in learning_phase
    name='keras_learning_phase')
  File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/ops/array_ops.py", line 1530, in placeholder
    return gen_array_ops._placeholder(dtype=dtype, shape=shape, name=name)
  File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1954, in _placeholder
    name=name)
  File "/home/ht/tensorflow3/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/ht/tensorflow3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1269, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'time_distributed_2/keras_learning_phase' with dtype bool
     [[Node: time_distributed_2/keras_learning_phase = Placeholder[dtype=DT_BOOL, shape=<unknown>, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
     [[Node: mul_1/_41 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_12859_mul_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

When I removed the built-in dropout parameter from the LSTM layers, the error disappeared. So I suspect the TimeDistributed wrapper still has some trouble wrapping dropout.

creotiv commented 7 years ago

Instead of encoded_frame = keras.layers.TimeDistributed(cnn)(video), try encoded_frame = keras.layers.TimeDistributed(cnn.outputs[0])(video)

pawanvirsingh commented 6 years ago

raise ValueError("Tensor %s is not an element of this graph." % obj)

ValueError: Tensor Tensor("predictions/Softmax:0", shape=(?, 1000), dtype=float32) is not an element of this graph. in keras VGG16 model

alvi75 commented 3 years ago

InvalidArgumentError: indices[0,19] = 87890 is not in [0, 35173) [[{{node embedding_2/embedding_lookup}}]]

COLAB: https://colab.research.google.com/drive/1vXAswtbuCf-wXQNc3mUQROlFzUUssBs_?usp=sharing
DATASET [ner_dataset] (15MB): https://www.kaggle.com/abhinavwalia95/entity-annotated-corpus?select=ner_dataset.csv

BiLSTM-CRF Model

input = Input(shape=(MAX_LEN,))
model = Embedding(input_dim=n_words + 1, output_dim=20,
              input_length=MAX_LEN, mask_zero=True)(input)  # 20-dim embedding
model = Bidirectional(LSTM(units=50, return_sequences=True,
                       recurrent_dropout=0.1))(model)  # variational biLSTM
model = TimeDistributed(Dense(50, activation="relu"))(model)  # a dense layer as suggested by neuralNer
crf = CRF(18)  # CRF layer
out = crf(model)  # output
model = Model(input, out)
model.compile(optimizer="rmsprop", loss=crf.loss_function, metrics=[crf.accuracy])
model.summary()
history = model.fit(tr_inputs, np.array(tr_tags), batch_size=32, epochs=5,
                validation_split=0.1, verbose=1)

I was trying BERT embeddings with a BiLSTM-CRF model but couldn't fix this issue. Using the bert-base-multilingual-uncased tokenizer, I get this InvalidArgumentError. It ran with the bert-base-cased tokenizer without any error.

Train on 38846 samples, validate on 4317 samples
Epoch 1/5
---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-71-d239a56e5cc7> in <module>()
      1 history = model.fit(tr_inputs, np.array(tr_tags), batch_size=32, epochs=5,
----> 2                     validation_split=0.1, verbose=1)

4 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in __call__(self, *args, **kwargs)
   1456         ret = tf_session.TF_SessionRunCallable(self._session._session,
   1457                                                self._handle, args,
-> 1458                                                run_metadata_ptr)
   1459         if run_metadata:
   1460           proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

InvalidArgumentError: indices[0,19] = 87890 is not in [0, 35173)
[[{{node embedding_2/embedding_lookup}}]]
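
The message says that index 87890 at position [0, 19] falls outside [0, 35173), which is presumably the input_dim (n_words + 1) of the Embedding layer above. One plausible cause, stated here as an assumption, is that the tokenizer emits IDs from a vocabulary larger than input_dim. A minimal sketch of a pre-flight check follows; `find_out_of_range_ids` is a hypothetical helper, not part of Keras.

```python
def find_out_of_range_ids(batch, input_dim):
    """Return (row, column, value) for every id outside [0, input_dim)."""
    bad = []
    for r, row in enumerate(batch):
        for c, value in enumerate(row):
            if not 0 <= value < input_dim:
                bad.append((r, c, value))
    return bad


# Illustrative data only: one sequence of 20 ids with an oversized id at column 19.
sequence = [101] + [5] * 18 + [87890]
offenders = find_out_of_range_ids([sequence], input_dim=35173)

# An Embedding layer needs input_dim strictly greater than the largest id it sees.
required_input_dim = max(max(row) for row in [sequence]) + 1
```

Running such a check on tr_inputs before model.fit would show whether input_dim needs to be raised to cover the tokenizer's vocabulary, or whether the IDs need to be remapped into the existing range.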

@Feynman27 @alfiya400 @StefPac please help....