Feynman27 closed this issue 3 years ago
I'm seeing the same error, though in a very different context. I have an implementation of Bidirectional Attention Flow (BiDAF) in Keras. I want to load that model, similar to how @Feynman27 is loading InceptionV3, then pull parts out of it for use in a model on a different, but similar task (see code here). This worked in Keras 1, but is breaking when porting our code to Keras 2. You can see a trace of the failure here. I spent several hours trying to figure out what is going on here, and I've hit a wall. I thought that the __call__ method might not be hooking up the inputs correctly, because there's an extra placeholder still lying around, but I made a minimal example trying to show that something is broken, and it actually works. I'm at a total loss for why my example doesn't have the same crash that my real model has.
I very much want this bug to be fixed - I'm happy to help debug, if anyone has suggestions on what to do. I've run out of ideas.
Why do you define the layers as (keras.layers.Input) etc.?
Why don't you use this style:
from keras.models import Model
from keras.layers import Input, Dense
a = Input(shape=(32,))
b = Dense(32)(a)
model = Model(inputs=a, outputs=b)
Could this be the error?
I am trying to build a similar model:
input ( video frames ) > Conv > tLSTM
The conv network alone works, but the LSTM is not working.
@MuOtb they're functionally identical, some people just don't like having multiple imports at the top of their file.
One of my colleagues referred me to this issue encountered when using BN in a sub-model applied to a time-distributed layer. Looks like a similar issue, but at the moment it's still unclear to me how to apply that recommendation to the case above with Inception.
@Feynman27 Potential workaround: set learning phase to 0 and pass Lambda(lambda x: cnn(x)) to TimeDistributed.
Here's the full example: https://gist.github.com/alfiya400/9d3bf303966f87a3c2aa92a0a0a54662
I also checked that the outputs from TimeDistributed and cnn.predict match each other.
The drawback of this approach: you can't use Dropouts and there might be other restrictions when learning_phase is set to 0.
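The consistency check mentioned above can be illustrated without Keras. Conceptually, TimeDistributed applies the wrapped model to every timestep, which should give the same result as merging samples and timesteps into one batch axis, running the model once, and splitting the result again. The sketch below is only an illustration of that equivalence: it uses plain Python lists, and `cnn` is a hypothetical stand-in function, not InceptionV3 or the code from the gist.

```python
def cnn(frame):
    """Hypothetical stand-in for a per-frame model: maps a frame (list of floats) to one feature."""
    return sum(frame) / len(frame)


def time_distributed(fn, batch):
    """Apply fn to every timestep of every sample: (samples, timesteps, ...) -> (samples, timesteps)."""
    return [[fn(frame) for frame in sample] for sample in batch]


def flatten_apply_reshape(fn, batch):
    """Same computation via one flat batch: merge samples and timesteps, apply fn, split again."""
    timesteps = len(batch[0])
    flat = [frame for sample in batch for frame in sample]
    flat_out = [fn(frame) for frame in flat]
    return [flat_out[i:i + timesteps] for i in range(0, len(flat_out), timesteps)]


# Two toy "videos", three frames each, four "pixels" per frame.
videos = [
    [[0.0, 1.0, 2.0, 3.0], [1.0, 1.0, 1.0, 1.0], [2.0, 2.0, 4.0, 4.0]],
    [[4.0, 4.0, 4.0, 4.0], [0.0, 0.0, 0.0, 8.0], [5.0, 5.0, 5.0, 5.0]],
]

per_step = time_distributed(cnn, videos)
via_flat = flatten_apply_reshape(cnn, videos)
```

With a real model the comparison would be approximate (floating point), and it only holds when the wrapped model behaves the same in both calls, which is exactly what a mismatched learning phase can break.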
@alfiya400, do you have any idea why that workaround works for the CNN? I tried those in my model, and it does not solve the issue.
@alfiya400 How do you try this out? I'm going to try your method and hope it works for me.
@buptss Just follow her gist link above. It worked for me but not @matt-gardner. We're trying to figure out why, but my suspicion is that it has something to do with the batch normalization, and the Lambda function is instantiating a new BN instance for each CNN output.
@matt-gardner I think there are two problems here (in the TimeDistributed over InceptionV3 problem):
1. If I use TimeDistributed(Lambda(lambda x: cnn(x))) but don't set K.set_learning_phase(0), I get an error like: You must feed a value for placeholder tensor 'batch_normalization_1/keras_learning_phase'. This happens because the uses_learning_phase parameter is not the same for cnn and model:
model.uses_learning_phase=False
cnn.uses_learning_phase=True # (because of the BatchNorm layer)
If you call model.fit, it first builds the list of inputs using this code and fails to add K.learning_phase to the list of inputs because model.uses_learning_phase=False. So, to fix that, you could set learning_phase to 0 or use Dropout/BatchNorm on all your models (using Dropout(0.0001) could probably be a workaround).
2. If I use TimeDistributed(cnn), I get the You must feed a value for placeholder tensor 'input_1' error. Using Lambda(lambda x: cnn(x)) helps, but I have no idea why. In your case you should probably try wrapping your model in a Lambda.
Yeah, I tried wrapping the TimeDistributed part of my model in a Lambda after your first comment, but it didn't work. I just now tried also wrapping the other Model that I use in a Lambda, and that didn't work, either. I still get the missing placeholder error (on an input tensor, not a batch norm tensor, so it's the second issue you mention, not the first). If I knew why adding the Lambda helps in your case, maybe I could figure out what's going wrong in my case, because I'm pretty sure they're related...
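The first problem described above (the learning-phase placeholder never being fed) can be pictured with a toy model of the feeding logic. This is a simplified sketch, not the Keras source: the function names and the placeholder names `video_input` and `labels` are made up for illustration, and the only assumption carried over from the thread is that the learning phase is fed only when the outer model reports uses_learning_phase=True.

```python
LEARNING_PHASE = "keras_learning_phase"


def fed_placeholders(input_names, target_names, uses_learning_phase):
    """Toy model of the list of placeholders a training call supplies values for."""
    fed = list(input_names) + list(target_names)
    if uses_learning_phase:
        # The learning phase is appended only when the outer model says it needs it.
        fed.append(LEARNING_PHASE)
    return fed


def unfed(required, fed):
    """Placeholders the graph needs but the training call did not feed."""
    return sorted(set(required) - set(fed))


# The graph needs the learning phase because the nested cnn contains BatchNorm.
required = ["video_input", "labels", LEARNING_PHASE]

outer_false = fed_placeholders(["video_input"], ["labels"], uses_learning_phase=False)
outer_true = fed_placeholders(["video_input"], ["labels"], uses_learning_phase=True)

missing_when_false = unfed(required, outer_false)
missing_when_true = unfed(required, outer_true)
```

In this toy model, the flag being False on the outer model is what leaves the learning-phase placeholder unfed, which matches the "You must feed a value for placeholder tensor" message. It does not explain the second problem (the unfed 'input_1' placeholder).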
I hit the same error when applying TimeDistributed to InceptionV3. I also think it is due to the compatibility of TimeDistributed and BatchNormalization, because I didn't hit this when using TimeDistributed to wrap VGG16, which does not have a BN layer. @alfiya400 Thanks for your solution! It works at least for now in my project.
I got the same problem. I agree with @Wenbo93. I think it is due to the compatibility of TimeDistributed and BatchNormalization. This is my code. I used BN in a TimeDistributed CNN.
from keras.models import Sequential, Model
from keras.layers import (Convolution2D, BatchNormalization, MaxPooling2D,
                          Flatten, TimeDistributed, LSTM, Dropout, Dense)

convs = Sequential()
convs.add(Convolution2D(int(hidden_size/(2**(nb_conv_layers-1))), 1, 1, input_shape=shape[1:], border_mode="same", bias=False, activation='relu'))
convs.add(Convolution2D(int(hidden_size/(2**(nb_conv_layers-1))), 3, 3, border_mode="same", bias=False, activation='relu'))
convs.add(BatchNormalization(axis=3))
convs.add(MaxPooling2D((2,2), border_mode='same'))
for l_cnn in range(1, nb_conv_layers):
    convs.add(Convolution2D(int(hidden_size/(2**(nb_conv_layers-l_cnn-1))), 1, 1, border_mode="same", bias=False, activation='relu'))
    convs.add(Convolution2D(int(hidden_size/(2**(nb_conv_layers-l_cnn-1))), 3, 3, border_mode="same", bias=False, activation='relu'))
    convs.add(Convolution2D(int(hidden_size/(2**(nb_conv_layers-l_cnn-1))), 3, 3, border_mode="same", bias=False, activation='relu'))
    convs.add(BatchNormalization(axis=3))
    convs.add(MaxPooling2D((2,2), border_mode='same'))
convs.add(Flatten())
# Wrap the CNN and connect it to an RNN
out = TimeDistributed(convs)(inputs)
for l_rnn in range(nb_rnn_layers-1):
    out = LSTM(512, return_sequences=True, activation='relu', stateful=stateful)(out)
out = LSTM(512, return_sequences=False, activation='relu', stateful=stateful)(out)
out = Dropout(0.2)(out)
out = Dense(1024, activation='relu')(out)
out = Dropout(0.2)(out)
out = Dense(1, activation=activation)(out)
tdcnn = Model(input=[inputs], output=[out])
I have a small example that reproduces the problem.
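import numpy as np
from keras.layers import Input, Reshape, TimeDistributed, LSTM
from keras.models import Model

nb_samples = 50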
input_a_len = 50
X = np.ones((nb_samples, 2, input_a_len), dtype=np.float32)
Y = np.ones((nb_samples, 2, 1), dtype=np.float32)
input_a = Input(shape=(2, input_a_len), name='input_a', dtype='float32')
input_a_reshaped = Reshape((2, input_a_len, 1))(input_a)
pred = TimeDistributed(LSTM(1, recurrent_dropout=0.1))(input_a_reshaped)
model = Model([input_a], pred)
model.compile(loss='binary_crossentropy', optimizer='sgd')
hist = model.fit(x=X, y=Y)
This produces the error:
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'time_distributed_1/keras_learning_phase' with dtype bool
[[Node: time_distributed_1/keras_learning_phase = Placeholder[dtype=DT_BOOL, shape=[], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
In this case, any of the following solves the problem:
setting K.set_learning_phase(1) explicitly, or
removing TimeDistributed (with dimensional adjustments), or
removing the recurrent_dropout option.
However, none of these workarounds seems an acceptable solution.
@StefPac Yeah, I have the same error. Adding set_learning_phase(1) solves it. What issues or consequences could arise from hard-coding this value when training (with validation) a model?
@abnera in that case Dropout will also drop out neurons during validation, for example.
@gewoonrik Thanks for the explanation. Yeah, I am getting very poor validation results by hard-coding the learning_phase: set_learning_phase(1).
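The effect described above can be sketched in plain Python. This is not Keras code: `dropout` below is a hypothetical, simplified inverted-dropout function, used only to show why hard-coding the learning phase to 1 makes validation passes noisy while phase 0 leaves activations untouched.

```python
import random


def dropout(values, rate, training, rng):
    """Inverted dropout: while training, zero each value with probability `rate`
    and rescale the survivors; otherwise pass the values through unchanged."""
    if not training:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


activations = [1.0] * 8

# Learning phase 0 (inference): the output equals the input on every pass.
eval_a = dropout(activations, rate=0.5, training=False, rng=random.Random(0))
eval_b = dropout(activations, rate=0.5, training=False, rng=random.Random(1))

# Learning phase hard-coded to 1: validation passes are randomly masked too.
val_a = dropout(activations, rate=0.5, training=True, rng=random.Random(0))
val_b = dropout(activations, rate=0.5, training=True, rng=random.Random(1))
```

With the phase fixed at 1, every validation value is either dropped or rescaled, so validation metrics are computed on perturbed activations, which is consistent with the poor validation results reported above.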
@QuantumLiu did you manage to find a workaround for the problem with the batchnorm layer? I am also encountering problems when trying to use TimeDistributed(BatchNormalization())(input), which gives
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'time_distributed_1/keras_learning_phase' with dtype bool
[[Node: time_distributed_1/keras_learning_phase = Placeholder[dtype=DT_BOOL, shape=<unknown>, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
[[Node: Mean_3/_33 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_1296_Mean_3", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
@gewoonrik I've circumvented this behavior by recreating the model without the dropout layers and reloading the weights into it - it loads and the predictions are stable indicating that dropout is not applied. Batch norm layer is not so easy though - it has weights so if you drop the layers the weights won't be loaded due to layer mismatch between "training model" and "predictive" one. I am thinking this could be circumvented by creating another layer that follows the same structure as Batch Norm but returns the same value when in_train_phase is called.
I hit the same issue in a different context. In my model I am trying to learn adaptively from Chinese character vectors to word vectors, and further to word properties (part-of-speech tags).
def HBLSTM4POS(maxword_per_sen=20, maxchar_per_word=8, word_vec_dim=52, pos_num=26):
    InputLayers = Input(shape=(maxword_per_sen, maxchar_per_word, word_vec_dim), name='InputTensor')
    Posmask = TimeDistributed(Masking(mask_value=0.0, input_shape=(8,52)), input_shape=(20,8,52))(InputLayers)
    WordLayer = TimeDistributed(Bidirectional(LSTM(52, return_sequences=False, dropout=0.1, input_shape=(8,52), name='WordVector')), input_shape=(20,8,52))(Posmask)
    POS_LSTM1 = Bidirectional(LSTM(52, return_sequences=True))(WordLayer)
    POS_LSTM2 = Bidirectional(LSTM(52, return_sequences=True))(POS_LSTM1)
    # Use the pos_num argument (the original referenced an undefined POS_NUM)
    Dense1 = TimeDistributed(Dense(pos_num*3, activation='relu'))(POS_LSTM2)
    Dense2 = TimeDistributed(Dense(pos_num, activation='softmax', name='POS_Output'))(Dense1)
    model = Model(inputs=InputLayers, outputs=Dense2)
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

train_data = np.random.rand(100,20,8,52)
train_y = np.random.randint(26, size=(100,20,26))
model = HBLSTM4POS()
model.fit(train_data, train_y, batch_size=10, epochs=2, validation_split=0.1)
Full error message:
2017-09-10 20:42:29.012323: W tensorflow/core/framework/op_kernel.cc:1158] Invalid argument: You must feed a value for placeholder tensor 'time_distributed_2/keras_learning_phase' with dtype bool
[[Node: time_distributed_2/keras_learning_phase = Placeholder[dtype=DT_BOOL, shape=<unknown>, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
Traceback (most recent call last):
File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1139, in _do_call
return fn(*args)
File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1121, in _run_fn
status, run_metadata)
File "/usr/lib/python3.5/contextlib.py", line 66, in __exit__
next(self.gen)
File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'time_distributed_2/keras_learning_phase' with dtype bool
[[Node: time_distributed_2/keras_learning_phase = Placeholder[dtype=DT_BOOL, shape=<unknown>, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
[[Node: mul_1/_41 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_12859_mul_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "integrated_model.py", line 36, in <module>
model.fit(train_data,train_y,batch_size=10,epochs=2,validation_split=0.1)
File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 1507, in fit
initial_epoch=initial_epoch)
File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 1156, in _fit_loop
outs = f(ins_batch)
File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 2269, in __call__
**self.session_kwargs)
File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 789, in run
run_metadata_ptr)
File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 997, in _run
feed_dict_string, options, run_metadata)
File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run
target_list, options, run_metadata)
File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'time_distributed_2/keras_learning_phase' with dtype bool
[[Node: time_distributed_2/keras_learning_phase = Placeholder[dtype=DT_BOOL, shape=<unknown>, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
[[Node: mul_1/_41 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_12859_mul_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
Caused by op 'time_distributed_2/keras_learning_phase', defined at:
File "integrated_model.py", line 35, in <module>
model = HBLSTM4POS()
File "integrated_model.py", line 15, in HBLSTM4POS
WordLayer = TimeDistributed(Bidirectional(LSTM(52,return_sequences=False,dropout=0.1,input_shape=(8,52),name='WordVector')),input_shape=(20,8,52))(Posmask)
File "/usr/local/lib/python3.5/dist-packages/keras/engine/topology.py", line 596, in __call__
output = self.call(inputs, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/keras/layers/wrappers.py", line 177, in call
y = self.layer.call(inputs) # (num_samples * timesteps, ...)
File "/usr/local/lib/python3.5/dist-packages/keras/layers/wrappers.py", line 263, in call
y = self.forward_layer.call(inputs, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/keras/layers/recurrent.py", line 333, in call
preprocessed_input = self.preprocess_input(inputs, training=None)
File "/usr/local/lib/python3.5/dist-packages/keras/layers/recurrent.py", line 1077, in preprocess_input
timesteps, training=training)
File "/usr/local/lib/python3.5/dist-packages/keras/layers/recurrent.py", line 46, in _time_distributed_dense
x = K.in_train_phase(x * expanded_dropout_matrix, x, training=training)
File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 2602, in in_train_phase
training = learning_phase()
File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 115, in learning_phase
name='keras_learning_phase')
File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/ops/array_ops.py", line 1530, in placeholder
return gen_array_ops._placeholder(dtype=dtype, shape=shape, name=name)
File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1954, in _placeholder
name=name)
File "/home/ht/tensorflow3/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/home/tensorflow3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/ht/tensorflow3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1269, in __init__
self._traceback = _extract_stack()
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'time_distributed_2/keras_learning_phase' with dtype bool
[[Node: time_distributed_2/keras_learning_phase = Placeholder[dtype=DT_BOOL, shape=<unknown>, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
[[Node: mul_1/_41 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_12859_mul_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
When I deleted the built-in dropout parameter in the LSTM layers, the error disappeared. So I suspect the TimeDistributed wrapper still has some trouble wrapping layers that use dropout.
Instead of encoded_frame = keras.layers.TimeDistributed(cnn)(video), try encoded_frame = keras.layers.TimeDistributed(cnn.outputs[0])(video)
raise ValueError("Tensor %s is not an element of this graph." % obj)
ValueError: Tensor Tensor("predictions/Softmax:0", shape=(?, 1000), dtype=float32) is not an element of this graph. in keras VGG16 model
COLAB: https://colab.research.google.com/drive/1vXAswtbuCf-wXQNc3mUQROlFzUUssBs_?usp=sharing
DATASET [ner_dataset] 15MB: https://www.kaggle.com/abhinavwalia95/entity-annotated-corpus?select=ner_dataset.csv
BiLSTM-CRF Model
input = Input(shape=(MAX_LEN,))
model = Embedding(input_dim=n_words + 1, output_dim=20,
                  input_length=MAX_LEN, mask_zero=True)(input)  # 20-dim embedding
model = Bidirectional(LSTM(units=50, return_sequences=True,
                           recurrent_dropout=0.1))(model)  # variational biLSTM
model = TimeDistributed(Dense(50, activation="relu"))(model)  # a dense layer as suggested by neuralNer
crf = CRF(18)  # CRF layer
out = crf(model)  # output
model = Model(input, out)
model.compile(optimizer="rmsprop", loss=crf.loss_function, metrics=[crf.accuracy])
model.summary()
history = model.fit(tr_inputs, np.array(tr_tags), batch_size=32, epochs=5,
                    validation_split=0.1, verbose=1)
I was trying BERT embeddings with a BiLSTM-CRF model but couldn't fix this issue. Using the bert-base-multilingual-uncased tokenizer, I get this InvalidArgumentError. I have run it with the bert-base-cased tokenizer with NULL error.
Train on 38846 samples, validate on 4317 samples
Epoch 1/5
---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
<ipython-input-71-d239a56e5cc7> in <module>()
1 history = model.fit(tr_inputs, np.array(tr_tags), batch_size=32, epochs=5,
----> 2 validation_split=0.1, verbose=1)
4 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in __call__(self, *args, **kwargs)
1456 ret = tf_session.TF_SessionRunCallable(self._session._session,
1457 self._handle, args,
-> 1458 run_metadata_ptr)
1459 if run_metadata:
1460 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)
InvalidArgumentError: indices[0,19] = 87890 is not in [0, 35173)
[[{{node embedding_2/embedding_lookup}}]]
@Feynman27 @alfiya400 @StefPac please help....
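The message `indices[0,19] = 87890 is not in [0, 35173)` says that the id at row 0, position 19 of the input is outside the range the Embedding layer accepts, which is set by input_dim (n_words + 1 in the code above). One plausible cause, not confirmed in this thread, is that the ids come from a tokenizer whose vocabulary is larger than n_words. A minimal way to check is sketched below; `find_out_of_range_ids` is a hypothetical helper and `toy_inputs` is made-up data standing in for tr_inputs, which is not shown in the thread.

```python
def find_out_of_range_ids(sequences, input_dim):
    """Return (row, position, token_id) for every id outside [0, input_dim)."""
    return [
        (row, pos, token_id)
        for row, seq in enumerate(sequences)
        for pos, token_id in enumerate(seq)
        if not 0 <= token_id < input_dim
    ]


# Toy stand-in for tr_inputs (assumption: rows of integer token ids).
toy_inputs = [
    [101, 2023, 87890, 0],
    [101, 35172, 35173, 0],
]
input_dim = 35173  # upper bound reported in the error message

bad = find_out_of_range_ids(toy_inputs, input_dim)
```

If the list is non-empty on the real data, the Embedding's input_dim would need to cover the largest id the tokenizer can produce, or the ids would need to be remapped into range before training.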
I'm building a CNN-LSTM network in Keras (v2.02) + Tensorflow (v1.0.1) using video frames as input. I'm setting up the network as shown below:
Some of the tensor properties are below:
Now I build the model and fit the data:
where frame_sequence is a sequence of video frames from one video:
All seems well up to the training step model.fit, where I get an error attributed to the input_1 placeholder in the InceptionV3 model input:
Training works without error if I build my CNN from scratch instead of loading InceptionV3. For example, replacing InceptionV3 with:
Here is some minimal code to reproduce the issue.