Gu-Youngfeng opened 5 years ago
I found an available function, keras.preprocessing.sequence.pad_sequences(), that can solve this problem. Try the following code:
# import the keras
from keras.preprocessing.sequence import pad_sequences
# data_1
data_1 = [[[1, 2, 3, 4], [5, 6, 7, 8]],[[1, 2, 3, 4], [6, 7, 8, 9], [5, 1, 1, 2]],[[1, 2, 3, 4]]]
# data_2
data_2 = pad_sequences(data_1, padding='post', maxlen=4)  # pad each sample to 4 timesteps with zeros
print(data_1)
print(data_2)
The result looks like this:
Using TensorFlow backend.
[[[1 2 3 4]
  [5 6 7 8]
  [0 0 0 0]
  [0 0 0 0]]

 [[1 2 3 4]
  [6 7 8 9]
  [5 1 1 2]
  [0 0 0 0]]

 [[1 2 3 4]
  [0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]]]
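As a side note (not from the original post): after padding, the true length of each sample can be recovered by counting the timestep rows that are not all zero. This is the same idea as the cal_length trick further below, shown here as a minimal NumPy sketch, assuming all-zero rows only ever come from padding:

```python
import numpy as np

# data_2 as produced above: 3 samples padded to 4 timesteps of 4 features
data_2 = np.array([
    [[1, 2, 3, 4], [5, 6, 7, 8], [0, 0, 0, 0], [0, 0, 0, 0]],
    [[1, 2, 3, 4], [6, 7, 8, 9], [5, 1, 1, 2], [0, 0, 0, 0]],
    [[1, 2, 3, 4], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
])

# a timestep counts as "real" if any of its features is non-zero
used = np.sign(np.abs(data_2).max(axis=2))   # (3, 4) mask of 0/1
lengths = used.sum(axis=1).astype(np.int32)  # true length per sample
print(lengths)  # [2 3 1]
```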
pad_sequences() seems to solve the problem, but after you pad the sequential data, the LSTM will still run its cells over the padded steps and include them in the computation. A safer method is to pass the sequence_length parameter to tf.nn.dynamic_rnn; the partial code is as follows:
seq_length = tf.placeholder(tf.int32, shape=(None,))  # one length per batch element
outputs, states = tf.nn.dynamic_rnn(cell, inputs, initial_state=initial_state, sequence_length=seq_length)
sess = tf.Session()
feed = {
    seq_length: [20, 15, 8],  # the true (un-padded) length of each sequence in the batch
    # other feeds
}
sess.run(outputs, feed_dict=feed)
Note that the parameter sequence_length in tf.nn.dynamic_rnn() is not an integer value but a vector; that is, TensorFlow needs to know how many steps of each of your sequences should be calculated.
The official explanation is in https://tensorflow.google.cn/api_docs/python/tf/nn/dynamic_rnn.
sequence_length: (optional) An int32/int64 vector sized [batch_size]. Used to copy-through state and zero-out outputs when past a batch element's sequence length. So it's more for performance than correctness.
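The "zero-out outputs" behavior described in the docs can be illustrated without TensorFlow. Here is a hedged NumPy sketch of what dynamic_rnn effectively does with sequence_length; the toy outputs and lengths are made up for illustration:

```python
import numpy as np

# toy RNN outputs: batch of 2, 4 timesteps, 3 units (values are arbitrary)
outputs = np.ones((2, 4, 3))
seq_length = np.array([2, 3])  # true length of each batch element

# timesteps at or past each element's length get zeroed out
time = np.arange(outputs.shape[1])            # [0 1 2 3]
mask = time[None, :] < seq_length[:, None]    # (2, 4) boolean mask
masked = outputs * mask[:, :, None]

print(masked[0].sum(axis=1))  # [3. 3. 0. 0.] -- steps 2 and 3 zeroed
```

Since the padded steps are skipped entirely, this is "more for performance than correctness" only if your padding rows are already zeros; otherwise it also protects the loss from garbage outputs.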
So the complete code is as follows:
# step-1: padding the train set and define the placeholder x and y
features_train = np.array(features_train) # features
features_train = pad_sequences(features_train, padding='post', maxlen=sequence_size) # padding with 0
labels_train = np.array(labels_train) # labels
# x has the shape (batch_size, sequence_size, feature_size), e.g. (4, 10, 45)
x = tf.placeholder(tf.float32, shape=(None, sequence_size, feature_size), name="features")
# y has the shape of (None, 1)
y = tf.placeholder(tf.float32, shape=(None,1), name="labels")
# step-2: set function to define the parameter sequence_length
# this function can calculate the non-zero value in each sequential data, for example,
# if seq = [[[1,2,3], [3,4,5]], [[3,4,5]], [[2,2,2],[4,5,6],[7,7,7],[8,9,0]]]
# then length = [2, 1, 4]
def cal_length(seq):
    used = tf.sign(tf.reduce_max(tf.abs(seq), 2))
    length = tf.reduce_sum(used, 1)
    length = tf.cast(length, tf.int32)
    return length
# step-3: build the LSTM model
lstm_cell = tf.nn.rnn_cell.LSTMCell(num_units=128)
outputs, state = tf.nn.dynamic_rnn(cell=lstm_cell, inputs=x, sequence_length=cal_length(x), dtype=tf.float32)
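One thing the snippet above leaves out: for the (None, 1) labels you usually want the output at each sequence's last valid step, not at the padded end. A hedged NumPy sketch of that gather follows; the shapes and lengths here are illustrative, not from the original code:

```python
import numpy as np

batch, max_steps, units = 3, 4, 5
outputs = np.random.rand(batch, max_steps, units)  # stand-in for dynamic_rnn outputs
lengths = np.array([2, 3, 1])                      # per-sample true lengths

# pick outputs[i, lengths[i] - 1, :] for every sample i
last_valid = outputs[np.arange(batch), lengths - 1]  # shape (3, 5)
```

With tf.nn.dynamic_rnn, an LSTMCell, and sequence_length set, the same information is also available as state.h, because the state is copied through unchanged past each sequence's length.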
To sum up: to construct an LSTM model over sequential data of un-fixed lengths, we pad the data with 0 as a pre-processing step, i.e. we change data_1 into data_2 as shown above, and then tell dynamic_rnn the true lengths via sequence_length.