tensorflow / nmt

TensorFlow Neural Machine Translation Tutorial
Apache License 2.0

AttributeError: 'LSTMStateTuple' object has no attribute 'get_shape' #174

Open · JustinLin610 opened this issue 6 years ago

JustinLin610 commented 6 years ago

I ran into this error while building the graph. Below is my code for the encoding layer, copied from Stack Overflow, but it does not work for me...

import tensorflow as tf


def encoding_layer(rnn_inputs, rnn_size, num_layers, keep_prob,
                   source_sequence_length, source_vocab_size,
                   encoding_embedding_size):
    """
    Create encoding layer
    :param rnn_inputs: Inputs for the RNN
    :param rnn_size: RNN size
    :param num_layers: Number of layers
    :param keep_prob: Dropout keep probability
    :param source_sequence_length: a list of the lengths of each sequence in the batch
    :param source_vocab_size: vocabulary size of source data
    :param encoding_embedding_size: embedding size of source data
    :return: tuple (RNN output, RNN state)
    """
    # embed_sequence returns a batch-major tensor: [batch, time, embedding_size]
    embed = tf.contrib.layers.embed_sequence(
        rnn_inputs, source_vocab_size, encoding_embedding_size)

    cell_fw = tf.contrib.rnn.MultiRNNCell(
        [tf.contrib.rnn.DropoutWrapper(tf.contrib.rnn.LSTMCell(rnn_size),
                                       input_keep_prob=keep_prob)
         for _ in range(num_layers)])
    cell_bw = tf.contrib.rnn.MultiRNNCell(
        [tf.contrib.rnn.DropoutWrapper(tf.contrib.rnn.LSTMCell(rnn_size),
                                       input_keep_prob=keep_prob)
         for _ in range(num_layers)])

    # embed is batch-major, so time_major must be False; also pass the
    # sequence lengths so padded steps are skipped.
    ((encoder_fw_output, encoder_bw_output),
     (encoder_fw_state, encoder_bw_state)) = tf.nn.bidirectional_dynamic_rnn(
        cell_fw=cell_fw, cell_bw=cell_bw, inputs=embed,
        sequence_length=source_sequence_length,
        dtype=tf.float32, time_major=False)
    encoder_outputs = tf.concat((encoder_fw_output, encoder_bw_output), 2)

    # Per layer, concatenate the forward and backward states and rewrap them
    # as LSTMStateTuples so downstream cells accept them.
    encoder_states = []
    for i in range(num_layers):
        if isinstance(encoder_fw_state[i], tf.contrib.rnn.LSTMStateTuple):
            encoder_state_c = tf.concat(
                values=(encoder_fw_state[i].c, encoder_bw_state[i].c),
                axis=1, name="encoder_state_c")
            encoder_state_h = tf.concat(
                values=(encoder_fw_state[i].h, encoder_bw_state[i].h),
                axis=1, name="encoder_state_h")
            encoder_state = tf.contrib.rnn.LSTMStateTuple(
                c=encoder_state_c, h=encoder_state_h)
        elif isinstance(encoder_fw_state[i], tf.Tensor):
            encoder_state = tf.concat(
                values=(encoder_fw_state[i], encoder_bw_state[i]),
                axis=1, name="bidirectional_concat")
        encoder_states.append(encoder_state)

    return encoder_outputs, tuple(encoder_states)
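
For reference, a minimal sketch of how this function might be wired up (the placeholder names, vocabulary size, and dimensions below are hypothetical):

source_ids = tf.placeholder(tf.int32, [None, None], name="source_ids")    # [batch, time]
source_lengths = tf.placeholder(tf.int32, [None], name="source_lengths")  # [batch]

encoder_outputs, encoder_states = encoding_layer(
    rnn_inputs=source_ids,
    rnn_size=128,
    num_layers=2,
    keep_prob=0.8,
    source_sequence_length=source_lengths,
    source_vocab_size=10000,
    encoding_embedding_size=64)

# encoder_outputs: [batch, time, 2 * rnn_size]
# encoder_states: tuple of num_layers LSTMStateTuples, each with c/h of
# shape [batch, 2 * rnn_size] after the forward/backward concat.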
denisb411 commented 6 years ago

@JustinLin610 Hello, did you solve this issue? I'm having the same problem.

oahziur commented 6 years ago

@nave01314 Make sure your decoder_cell's number of layers matches the number of states you are passing into it. It seems your example has two encoder layers (1 forward and 1 backward); however, your decoder has only 1 layer. A sketch of the fix follows below.
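
Concretely, if the encoder returns num_layers states of width 2 * rnn_size (as in the snippet above), a matching decoder might look like this (a sketch; decoder_inputs and decoder_lengths are hypothetical):

decoder_cell = tf.contrib.rnn.MultiRNNCell(
    [tf.contrib.rnn.LSTMCell(2 * rnn_size) for _ in range(num_layers)])

# The initial state must mirror the cell: num_layers LSTMStateTuples,
# each with c/h of width 2 * rnn_size (forward + backward concatenated).
decoder_outputs, decoder_state = tf.nn.dynamic_rnn(
    decoder_cell, decoder_inputs,
    sequence_length=decoder_lengths,
    initial_state=encoder_states,
    dtype=tf.float32)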

ruoruoliu commented 6 years ago

@oahziur I notice that encoder_state is copied directly to decoder_initial_state. Does it make sense that the states from a bidirectional LSTM, which yields two (forward and backward), can be used as the initial states for two layers of a unidirectional LSTM?

oahziur commented 6 years ago

@liuyujia1991 It should be possible, although I haven't tested it myself.
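
A sketch of that pairing (untested, and assuming the encoder used plain LSTMCells rather than MultiRNNCells, so encoder_fw_state and encoder_bw_state are each a single LSTMStateTuple of width rnn_size):

decoder_cell = tf.contrib.rnn.MultiRNNCell(
    [tf.contrib.rnn.LSTMCell(rnn_size) for _ in range(2)])

# Feed the forward state to layer 0 and the backward state to layer 1;
# each state's width must match the corresponding decoder layer.
decoder_initial_state = (encoder_fw_state, encoder_bw_state)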