I have the same issue.
The following code:
batch_input_shape = (None, nb_inputs)  # per-timestep feature shape for the recurrent cells
input_layer = Input(batch_shape=(None, None, nb_inputs))  # (batch, time, features)
encoder = RecurrentContainer(
    stateful=stateful,
    return_sequences=True
)
encoder.add(LSTMCell(
    hidden_size_encoder,
    batch_input_shape=batch_input_shape
))
for _ in range(1, depth_encoder):
    encoder.add(Dropout(dropout))
    encoder.add(LSTMCell(
        hidden_size_encoder
    ))
encoder = Bidirectional(encoder, merge_mode=merge_mode)
encoded = encoder(input_layer)
batch_input_shape = (None, None, hidden_size_encoder)
decoder = RecurrentContainer(
    decode=True,
    stateful=stateful,
    output_length=output_length
)
decoder.add(Dropout(
    dropout, batch_input_shape=batch_input_shape
))
decoder.add(AttentionDecoderCell(
    output_dim=hidden_size_decoder,
    hidden_dim=hidden_size_decoder
))
for _ in range(1, depth_decoder):
    decoder.add(Dropout(dropout))
    decoder.add(LSTMDecoderCell(
        output_dim=hidden_size_decoder,
        hidden_dim=hidden_size_decoder
    ))
decoded = decoder(encoded)
output = TimeDistributed(
    Dense(
        output_dim,
        activation=output_activation
    )
)(decoded)
model = Model([input_layer], output)
results in:
theano.gof.fg.MissingInputError: An input of the graph, used to compute dot(<TensorType(float32, matrix)>, HostFromGpu.0), was not provided and not given a value. Use the Theano flag exception_verbosity='high', for more information on this error.
For some reason, if I manually write the code of AttentionSeq2Seq (i.e. copy/paste the body of the AttentionSeq2Seq function into a different file and use the pasted code instead of seq2seq.AttentionSeq2Seq) and add a TimeDistributedDense at the end, I get the missing input error.
If I instead append the TimeDistributed layer inside seq2seq.AttentionSeq2Seq itself, then everything works fine!
So, what is happening?
I would like to be able to use the code inside the seq2seq.AttentionSeq2Seq method, because I want to specify the number of cells for each layer of the encoder.
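For illustration, here is a minimal sketch of what I mean, reusing the same RecurrentContainer/LSTMCell calls as in the code above (hidden_sizes is a hypothetical list of per-layer cell counts, not part of the library):
# Hypothetical sketch: a different LSTMCell size per encoder layer, instead of
# one hidden_size_encoder repeated for every layer as AttentionSeq2Seq does.
hidden_sizes = [256, 128, 64]  # illustrative per-layer cell counts
encoder = RecurrentContainer(return_sequences=True)
encoder.add(LSTMCell(hidden_sizes[0], batch_input_shape=(None, nb_inputs)))
for size in hidden_sizes[1:]:
    encoder.add(Dropout(dropout))
    encoder.add(LSTMCell(size))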
This is what I am also doing. I added the Embedding layer and the Dense inside the same model, and things are working. If I nest the models, things break. I'm not sure if this is meant to be supported by Keras, or if this is perhaps a bug in the RecurrentContainer code.
Doesn't seem to be an issue with Keras, since nesting usual models seems to work. Will fix this soon.
Fixed.
Hi. It is not fixed. I tried again with the previously posted code and the missing input error persists.
The code is:
encoder = RecurrentContainer()
encoder.add(LSTMCell())
for _ in range():
    encoder.add(Dropout())
    encoder.add(LSTMCell())
input_layer = Input()
input_layer._keras_history[0].supports_masking = True
encoder = Bidirectional(encoder)
encoded = encoder(input_layer)
decoder = RecurrentContainer()
decoder.add(Dropout())
decoder.add(AttentionDecoderCell())
for _ in range():
    decoder.add(Dropout())
    decoder.add(LSTMDecoderCell())
decoded = decoder(encoded)
output = TimeDistributed(Dense())(decoded)
model = Model(input_layer, output)
The error is:
theano.gof.fg.MissingInputError: An input of the graph, used to compute dot(<TensorType(float32, matrix)>, HostFromGpu.0), was not provided and not given a value. Use the Theano flag exception_verbosity='high', for more information on this error.
The weird thing is that when used as:
m = AttentionSeq2Seq(
    input_dim=self._input_nb_features,
    input_length=self._input_length,
    hidden_dim=self._hidden_size_words,
    output_length=self._output_length,
    output_dim=self._hidden_size_words,
    depth=depth
)
model = Sequential()
model.add(m)
model.add(TimeDistributed(Dense()))
model.compile(loss='mse', optimizer='rmsprop')
works....
That means that we do not have any fine-grained control over the layers of the encoder (e.g. a different number of cells for each layer, or residual connections).
FYI, using this with seq2seq models directly works fine for me (apparently also with different depth parameters). I add an embedding and a convolution before the model, and a time-distributed dense with softmax after. Perhaps there's a deeper issue in recurrentshop that makes your code fail? Did you update both libraries?
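For reference, a minimal sketch of the pipeline I mean, using the Keras 1.x API (vocab_size, embed_dim, maxlen, nb_filter, filter_length, hidden_dim and out_maxlen are illustrative values, not from any particular project):
from keras.models import Sequential
from keras.layers import Embedding, Convolution1D, TimeDistributed, Dense
from seq2seq import Seq2Seq

vocab_size, embed_dim, maxlen = 1000, 64, 20  # illustrative sizes
nb_filter, filter_length, hidden_dim, out_maxlen = 64, 3, 128, 20

# Embedding -> 1D convolution -> Seq2Seq -> per-timestep softmax readout.
model = Sequential()
model.add(Embedding(vocab_size, embed_dim, input_length=maxlen))
model.add(Convolution1D(nb_filter, filter_length, border_mode='same'))
model.add(Seq2Seq(input_dim=nb_filter, hidden_dim=hidden_dim,
                  output_length=out_maxlen, output_dim=hidden_dim, depth=2))
model.add(TimeDistributed(Dense(vocab_size, activation='softmax')))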
If you mean the code that you posted:
I'm trying to use a Seq2Seq model as follows:
input = Input(shape=(maxlen,))
one_hot = Lambda(
    lambda x: K.one_hot(K.cast(x, dtype="int32"), nb_classes=num_inputs),
    output_shape=(maxlen, num_inputs)
)(input)
output_seq = Seq2Seq(
    input_shape=(maxlen, num_inputs),
    hidden_dim=hidden_dim,
    output_length=out_maxlen,
    output_dim=num_inputs,
    depth=2,
    peek=True
)(one_hot)
predicted = TimeDistributed(Activation("softmax"))(output_seq)
model = Model(input, predicted)
return model
then, yes. This works.
But I tried to reproduce the model of AttentionSeq2Seq outside of the seq2seq package in order to have, e.g., a different number of cells in each encoder layer. This did not work.
If I create the model from the Seq2Seq package and then add layers before and/or after it, then it works.
@dr-costas Post your actual code.
Hi
I just tried the following code:
# Imports (assuming Keras 1.x plus the recurrentshop and seq2seq packages):
import numpy as np
from keras.layers import Input, Dense, Dropout, TimeDistributed, Bidirectional
from keras.models import Model
from recurrentshop import RecurrentContainer, LSTMCell
from seq2seq.cells import AttentionDecoderCell, LSTMDecoderCell

input_length = 10
batch_size = None
output_length = 10
input_features = 32
lstm_cells = 32
dropout = .5
output_dim = 32
dense_output = 2
batch_shape = (batch_size, input_length, input_features)
x = np.random.rand(2, input_length, input_features)
y = np.random.rand(2, output_length, dense_output)
encoder = RecurrentContainer(input_length=input_length, return_sequences=True)
encoder.add(LSTMCell(lstm_cells, batch_input_shape=(batch_size, input_features)))
for _ in range(1, 3):
    encoder.add(Dropout(dropout))
    encoder.add(LSTMCell(lstm_cells))
input_layer = Input(batch_shape=batch_shape)
input_layer._keras_history[0].supports_masking = True
encoder = Bidirectional(encoder, merge_mode='sum')
encoded = encoder(input_layer)
decoder = RecurrentContainer(decode=True, output_length=output_length)
decoder.add(Dropout(dropout, batch_input_shape=batch_shape))
decoder.add(AttentionDecoderCell(output_dim=output_dim, hidden_dim=output_dim))
for _ in range(1, 3):
    decoder.add(Dropout(dropout))
    decoder.add(LSTMDecoderCell(output_dim=output_dim, hidden_dim=output_dim))
decoded = decoder(encoded)
output = TimeDistributed(Dense(dense_output))(decoded)
model = Model(input_layer, output)
model.compile(loss='mse', optimizer='rmsprop')
model.fit(x, y)
and it works.
Thanks.
Hi,
I used the exact code that I posted previously inside a class, but it does not work.
The code is:
encoder = RecurrentContainer(input_length=None, return_sequences=True)
encoder.add(LSTMCell(self._hidden_size_1, batch_input_shape=(None, self._input_nb_features)))
for _ in range(1, self._depth_encoder):
    encoder.add(Dropout(self._dropout))
    encoder.add(LSTMCell(self._hidden_size_1))
input_layer = Input(batch_shape=(None, None, self._input_nb_features))
input_layer._keras_history[0].supports_masking = True
encoder = Bidirectional(encoder, merge_mode='sum')
encoded = encoder(input_layer)
decoder = RecurrentContainer(decode=True, output_length=self._output_length)
decoder.add(Dropout(self._dropout, batch_input_shape=(None, None, self._input_nb_features)))
decoder.add(AttentionDecoderCell(output_dim=self._hidden_size_2, hidden_dim=self._hidden_size_2))
for _ in range(1, self._depth_decoder):
    decoder.add(Dropout(self._dropout))
    decoder.add(LSTMDecoderCell(output_dim=self._hidden_size_2, hidden_dim=self._hidden_size_2))
decoded = decoder(encoded)
output = TimeDistributed(Dense(self._output_dim, activation=self._output_activation))(decoded)
self._model = Model(input_layer, output)
The error is the usual missing input error.
Following issue #131, I set dropout = 0 and the error went away.
So, can we use seq2seq as above with dropout values greater than 0.0?
The problem occurs in the fit function, not in the predict function.
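For anyone hitting the same thing, the workaround is simply (a sketch against the class-based code above):
# Workaround from issue #131: disable dropout so fit() no longer raises
# MissingInputError, at the cost of losing dropout regularization.
self._dropout = 0.0  # instead of e.g. 0.5
# ...then build the encoder/decoder exactly as above and call fit().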
Hi, I have the same problem: I have a working Seq2Seq with the dropout left unchanged (so dropout = 0.0), but when I try to set the dropout to 0.1, for example, the MissingInputError is raised when I fit the model. Did you find a way to get it working?
from seq2seq import Seq2Seq
import numpy as np

X = np.random.rand(10, 20, 128)
y = X
model = Seq2Seq(input_dim=128, output_dim=128, output_length=20, hidden_dim=128, dropout=0.1)
model.compile(loss='mse', optimizer='rmsprop', metrics=['accuracy'])
model.fit(X, y, nb_epoch=5)
MissingInputError: ("An input of the graph, used to compute dot(<TensorType(float32, matrix)>, lstmcell_4_U), was not provided and not given a value.Use the Theano flag exception_verbosity='high',for more information on this error.", <TensorType(float32, matrix)>)
Your script works fine for me. Are you sure you are not using early stopping? https://github.com/farizrahman4u/seq2seq/issues/118#issuecomment-268564548
Is it possible I'm using early stopping without knowing it?
Hi! Sorry for crossposting this (I also opened this issue on the Keras main repo), but I figured maybe it's actually related to seq2seq or recurrentshop internals.
I'm trying to use a Seq2Seq model as follows:
Which compiles fine, but when I try to fit the model using my (num_samples, maxlen)-shaped matrix, Theano complains that input_2 was not provided; as it turns out, that is the input layer of the Seq2Seq model. I was hoping this layer would be fed the output of my Lambda layer automatically, but apparently this does not work. Is what I am trying to do possible? I realize I could just copy and slightly alter the Seq2Seq code, but of course I'd prefer just using the library for more maintainable code. More precise exception output: