Currie32 / Spell-Checker

A seq2seq model that can correct spelling mistakes.

DynamicAttentionWrapper' #3

Closed kiranvarghesev closed 6 years ago

kiranvarghesev commented 6 years ago

dec_cell = tf.contrib.seq2seq.DynamicAttentionWrapper(dec_cell,

AttributeError: module 'tensorflow.contrib.seq2seq' has no attribute 'DynamicAttentionWrapper'

FurKan7 commented 6 years ago

You need to use TensorFlow version 1.1.

pakgya commented 6 years ago

Use AttentionWrapper and convert the initial state to: initial_state = dec_cell.zero_state(batch_size=batch_size,dtype=tf.float32).clone(cell_state=enc_state)
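
For reference, a minimal sketch of that change, assuming the notebook's existing variable names (dec_cell, attn_mech, rnn_size, enc_state, batch_size) and the TF 1.2+ contrib API:

# Wrap the decoder cell with the attention mechanism (replaces DynamicAttentionWrapper).
dec_cell = tf.contrib.seq2seq.AttentionWrapper(dec_cell,
                                               attn_mech,
                                               rnn_size)
# Build the wrapper's own zero state, then clone it with the encoder's final state.
initial_state = dec_cell.zero_state(batch_size=batch_size,
                                    dtype=tf.float32).clone(cell_state=enc_state)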

yashugupta786 commented 6 years ago

Hi, I have changed DynamicAttentionWrapper to dec_cell = tf.contrib.seq2seq.AttentionWrapper(dec_cell, attn_mech, rnn_size) and the initial state to initial_state = dec_cell.zero_state(batch_size=batch_size, dtype=tf.float32).clone(cell_state=enc_state)

but I am still getting errors at the line initial_state = dec_cell.zero_state(batch_size=batch_size, dtype=tf.float32).clone(cell_state=enc_state)

Please help.

pranav774 commented 6 years ago

I am using TensorFlow version 1.8 and have changed DynamicAttentionWrapper to AttentionWrapper, but I am getting the error below: __new__() missing 4 required positional arguments: 'time', 'alignments', 'alignment_history', and 'attention_state'

in def seq2seq_model. Can you please help me?

Olivia-Meng commented 6 years ago

@pranav774 I have the same problem. If you have solved it, please share how.

pranav774 commented 6 years ago

No, I have not... I'm still struggling with it.

Sofwath commented 6 years ago

change to: initial_state = dec_cell.zero_state(batch_size=batch_size,dtype=tf.float32).clone(cell_state=enc_state)

also

inference_logits, _, _ = tf.contrib.seq2seq.dynamic_decode(inference_decoder, output_time_major=False, impute_finished=True, maximum_iterations=max_target_length)

and

training_logits, _, _ = tf.contrib.seq2seq.dynamic_decode(training_decoder, output_time_major=False, impute_finished=True, maximum_iterations=max_target_length)
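
Written out together (a sketch using the notebook's variable names; in recent TF 1.x versions dynamic_decode returns the outputs, the final state, and the sequence lengths, so the extra return values are discarded with underscores):

initial_state = dec_cell.zero_state(batch_size=batch_size,
                                    dtype=tf.float32).clone(cell_state=enc_state)

training_logits, _, _ = tf.contrib.seq2seq.dynamic_decode(training_decoder,
                                                          output_time_major=False,
                                                          impute_finished=True,
                                                          maximum_iterations=max_target_length)

inference_logits, _, _ = tf.contrib.seq2seq.dynamic_decode(inference_decoder,
                                                           output_time_major=False,
                                                           impute_finished=True,
                                                           maximum_iterations=max_target_length)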

ashish9308 commented 6 years ago

@pranav774 Resolved or not?

asingh9530 commented 6 years ago

@pranav774 Have you found any solution?

chaku88 commented 6 years ago

change to: initial_state = dec_cell.zero_state(batch_size=batch_size,dtype=tf.float32).clone(cell_state=enc_state)

also

inference_logits, _, _ = tf.contrib.seq2seq.dynamic_decode(inference_decoder, output_time_major=False, impute_finished=True, maximum_iterations=max_target_length)

and

training_logits, _, _ = tf.contrib.seq2seq.dynamic_decode(training_decoder, output_time_major=False, impute_finished=True, maximum_iterations=max_target_length)

These changes, along with replacing DynamicAttentionWrapper with AttentionWrapper, work with the latest TensorFlow version (1.10.1). Thanks @Sofwath

asingh9530 commented 6 years ago

@Sofwath Thanks, it worked.

HariWu1995 commented 6 years ago

@Sofwath Thanks so much. It works for me.

SumITUIC commented 5 years ago

change to: initial_state = dec_cell.zero_state(batch_size=batch_size,dtype=tf.float32).clone(cell_state=enc_state) also inference_logits, _, _ = tf.contrib.seq2seq.dynamic_decode(inference_decoder, output_time_major=False, impute_finished=True, maximum_iterations=max_target_length) and training_logits, _, _ = tf.contrib.seq2seq.dynamic_decode(training_decoder, output_time_major=False, impute_finished=True, maximum_iterations=max_target_length)

These changes, along with replacing DynamicAttentionWrapper with AttentionWrapper, work with the latest TensorFlow version (1.10.1). Thanks @Sofwath

@chaku88 @Abhinavfreecodecamp @Sofwath @HariWu1995 I made these changes to the decoding_layer function, but it still gives me the error: The two structures don't have the same nested structure.

Please help.

fendyaugust commented 5 years ago

Can I ask how to use the model again for testing when it was already trained on another computer? Please help :(
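
For reference, a minimal sketch of restoring a saved TF 1.x checkpoint on another machine (the checkpoint path below is a placeholder, not the notebook's actual filename):

import tensorflow as tf

checkpoint = "./my_model.ckpt"  # hypothetical path: use whatever your training run saved
loaded_graph = tf.Graph()
with tf.Session(graph=loaded_graph) as sess:
    # Rebuild the graph from the .meta file, then load the trained weights.
    loader = tf.train.import_meta_graph(checkpoint + '.meta')
    loader.restore(sess, checkpoint)
    # Fetch the input/output tensors by name and call sess.run(...) for inference.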

pgup2501 commented 5 years ago

@SumITUIC @HariWu1995 I made the changes as described, however I am getting an error. TensorFlow version: 1.10.0

TypeError: The two structures don't have the same nested structure.

First structure: type=AttentionWrapperState str=AttentionWrapperState(cell_state=LSTMStateTuple(c=<tf.Tensor 'AttentionWrapperZeroState/checked_cell_state:0' shape=(64, 256) dtype=float32>, h=<tf.Tensor 'AttentionWrapperZeroState/checked_cell_state_1:0' shape=(64, 256) dtype=float32>), attention=<tf.Tensor 'AttentionWrapperZeroState/zeros_2:0' shape=(64, 256) dtype=float32>, time=<tf.Tensor 'AttentionWrapperZeroState/zeros_1:0' shape=() dtype=int32>, alignments=<tf.Tensor 'AttentionWrapperZeroState/zeros:0' shape=(64, ?) dtype=float32>, alignment_history=(), attention_state=<tf.Tensor 'AttentionWrapperZeroState/zeros_3:0' shape=(64, ?) dtype=float32>)

Second structure: type=AttentionWrapperState str=AttentionWrapperState(cell_state=(LSTMStateTuple(c=<tf.Tensor 'encoder_1/bidirectional_rnn/fw/fw/while/Exit_3:0' shape=(?, 256) dtype=float32>, h=<tf.Tensor 'encoder_1/bidirectional_rnn/fw/fw/while/Exit_4:0' shape=(?, 256) dtype=float32>), LSTMStateTuple(c=<tf.Tensor 'encoder_1/bidirectional_rnn/bw/bw/while/Exit_3:0' shape=(?, 256) dtype=float32>, h=<tf.Tensor 'encoder_1/bidirectional_rnn/bw/bw/while/Exit_4:0' shape=(?, 256) dtype=float32>)), attention=<tf.Tensor 'AttentionWrapperZeroState/zeros_2:0' shape=(64, 256) dtype=float32>, time=<tf.Tensor 'AttentionWrapperZeroState/zeros_1:0' shape=() dtype=int32>, alignments=<tf.Tensor 'AttentionWrapperZeroState/zeros:0' shape=(64, ?) dtype=float32>, alignment_history=(), attention_state=<tf.Tensor 'AttentionWrapperZeroState/zeros_3:0' shape=(64, ?) dtype=float32>)

g2sgautam commented 5 years ago

@Abhinavfreecodecamp @chaku88 @HariWu1995 Can you please provide the full code?

dikshakhanna commented 5 years ago

attention_keys, attention_values, attention_score_function, attention_construct_function = tf.contrib.seq2seq.AttentionWrapper(attention_states,attention_option = "bahdanau", num_units=decoder_cell.output_size)

TypeError: __init__() got an unexpected keyword argument 'attention_option'

Can anyone solve this issue?
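
That attention_option="bahdanau" keyword appears to come from the pre-1.1 contrib seq2seq API; in TF 1.2+ the attention mechanism is a separate object that you pass to AttentionWrapper, as the other comments here do. A sketch of the equivalent setup (attention_states, input_lengths, and decoder_cell are assumed to already exist in your code):

# Build the attention mechanism over the encoder outputs.
attn_mech = tf.contrib.seq2seq.BahdanauAttention(num_units=decoder_cell.output_size,
                                                 memory=attention_states,
                                                 memory_sequence_length=input_lengths)
# Wrap the decoder cell with it instead of passing attention_option to AttentionWrapper.
decoder_cell = tf.contrib.seq2seq.AttentionWrapper(decoder_cell,
                                                   attn_mech,
                                                   attention_layer_size=decoder_cell.output_size)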

kinalmehta commented 5 years ago

Replace the complete decoding_layer function with:

def decoding_layer(dec_embed_input, embeddings, enc_output, enc_state, vocab_size, inputs_length, targets_length, 
                   max_target_length, rnn_size, vocab_to_int, keep_prob, batch_size, num_layers, direction):
    '''Create the decoding cell and attention for the training and inference decoding layers'''

    with tf.name_scope("RNN_Decoder_Cell"):
        for layer in range(num_layers):
            with tf.variable_scope('decoder_{}'.format(layer)):
                lstm = tf.contrib.rnn.LSTMCell(rnn_size)
                dec_cell = tf.contrib.rnn.DropoutWrapper(lstm, 
                                                         input_keep_prob = keep_prob)

    # Dense is assumed to be imported elsewhere in the notebook (typically from tensorflow.python.layers.core in TF 1.x)
    output_layer = Dense(vocab_size,
                         kernel_initializer = tf.truncated_normal_initializer(mean = 0.0, stddev=0.1))

    attn_mech = tf.contrib.seq2seq.BahdanauAttention(rnn_size,
                                                  enc_output,
                                                  inputs_length,
                                                  normalize=False,
                                                  name='BahdanauAttention')

    with tf.name_scope("Attention_Wrapper"):
        dec_cell = tf.contrib.seq2seq.AttentionWrapper(dec_cell,
                                                              attn_mech,
                                                              rnn_size)

    initial_state =  dec_cell.zero_state(batch_size=batch_size,dtype=tf.float32).clone(cell_state=enc_state)
#     initial_state = tf.contrib.seq2seq.DynamicAttentionWrapperState(enc_state,
#                                                                     _zero_state_tensors(rnn_size, 
#                                                                                         batch_size, 
#                                                                                         tf.float32))

    with tf.variable_scope("decode"):
#         training_logits = training_decoding_layer(dec_embed_input, 
#                                                   targets_length, 
#                                                   dec_cell, 
#                                                   initial_state,
#                                                   output_layer,
#                                                   vocab_size, 
#                                                   max_target_length)
        training_helper = tf.contrib.seq2seq.TrainingHelper(inputs=dec_embed_input,
                                                            sequence_length=targets_length,
                                                            time_major=False)
        training_decoder = tf.contrib.seq2seq.BasicDecoder(dec_cell,
                                                           training_helper,
                                                           initial_state,
                                                           output_layer) 
        training_logits, _ ,_ = tf.contrib.seq2seq.dynamic_decode(training_decoder,
                                            output_time_major=False,
                                            impute_finished=True,
                                            maximum_iterations=max_target_length)

    with tf.variable_scope("decode", reuse=True):
#         inference_logits = inference_decoding_layer(embeddings,  
#                                                     vocab_to_int['<GO>'], 
#                                                     vocab_to_int['<EOS>'],
#                                                     dec_cell, 
#                                                     initial_state, 
#                                                     output_layer,
#                                                     max_target_length,
#                                                     batch_size)
        start_tokens = tf.tile(tf.constant([vocab_to_int['<GO>']], dtype=tf.int32), [batch_size], name='start_tokens')
        end_token = (tf.constant(vocab_to_int['<EOS>'], dtype=tf.int32))
        inference_helper = tf.contrib.seq2seq.GreedyEmbeddingHelper(embeddings,
                                                                    start_tokens,
                                                                    end_token)
        inference_decoder = tf.contrib.seq2seq.BasicDecoder(dec_cell,
                                                            inference_helper,
                                                            initial_state,
                                                            output_layer)
        inference_logits, _ ,_ = tf.contrib.seq2seq.dynamic_decode(inference_decoder,
                                            output_time_major=False,
                                            impute_finished=True,
                                            maximum_iterations=max_target_length)

    return training_logits, inference_logits

ThDarkKnight commented 5 years ago

change to: initial_state = dec_cell.zero_state(batch_size=batch_size,dtype=tf.float32).clone(cell_state=enc_state) also inference_logits, _, _ = tf.contrib.seq2seq.dynamic_decode(inference_decoder, output_time_major=False, impute_finished=True, maximum_iterations=max_target_length) and training_logits, _, _ = tf.contrib.seq2seq.dynamic_decode(training_decoder, output_time_major=False, impute_finished=True, maximum_iterations=max_target_length)

These changes, along with replacing DynamicAttentionWrapper with AttentionWrapper, work with the latest TensorFlow version (1.10.1). Thanks @Sofwath

I am running the code on Colab. Can anyone help solve the problem below? After making those changes, I am getting this error (possibly a version problem?):


ValueError                                Traceback (most recent call last)
<ipython-input> in <module>()
      5     num_layers,
      6     threshold)
----> 7 model = build_graph(keep_probability, rnn_size, num_layers, batch_size, learning_rate, embedding_size, direction)
      8 train(model, epochs, log_string)

2 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/tensor_shape.py in __len__(self)
    743     """Returns the rank of this shape, or raises ValueError if unspecified."""
    744     if self._dims is None:
--> 745       raise ValueError("Cannot take the length of shape with unknown rank.")
    746     return len(self._dims)
    747

ValueError: Cannot take the length of shape with unknown rank.

consterwinter commented 5 years ago

@SumITUIC @HariWu1995 I made the changes as described, however I am getting an error. TensorFlow version: 1.10.0

TypeError: The two structures don't have the same nested structure.

First structure: type=AttentionWrapperState str=AttentionWrapperState(cell_state=LSTMStateTuple(c=<tf.Tensor 'AttentionWrapperZeroState/checked_cell_state:0' shape=(64, 256) dtype=float32>, h=<tf.Tensor 'AttentionWrapperZeroState/checked_cell_state_1:0' shape=(64, 256) dtype=float32>), attention=<tf.Tensor 'AttentionWrapperZeroState/zeros_2:0' shape=(64, 256) dtype=float32>, time=<tf.Tensor 'AttentionWrapperZeroState/zeros_1:0' shape=() dtype=int32>, alignments=<tf.Tensor 'AttentionWrapperZeroState/zeros:0' shape=(64, ?) dtype=float32>, alignment_history=(), attention_state=<tf.Tensor 'AttentionWrapperZeroState/zeros_3:0' shape=(64, ?) dtype=float32>)

Second structure: type=AttentionWrapperState str=AttentionWrapperState(cell_state=(LSTMStateTuple(c=<tf.Tensor 'encoder_1/bidirectional_rnn/fw/fw/while/Exit_3:0' shape=(?, 256) dtype=float32>, h=<tf.Tensor 'encoder_1/bidirectional_rnn/fw/fw/while/Exit_4:0' shape=(?, 256) dtype=float32>), LSTMStateTuple(c=<tf.Tensor 'encoder_1/bidirectional_rnn/bw/bw/while/Exit_3:0' shape=(?, 256) dtype=float32>, h=<tf.Tensor 'encoder_1/bidirectional_rnn/bw/bw/while/Exit_4:0' shape=(?, 256) dtype=float32>)), attention=<tf.Tensor 'AttentionWrapperZeroState/zeros_2:0' shape=(64, 256) dtype=float32>, time=<tf.Tensor 'AttentionWrapperZeroState/zeros_1:0' shape=() dtype=int32>, alignments=<tf.Tensor 'AttentionWrapperZeroState/zeros:0' shape=(64, ?) dtype=float32>, alignment_history=(), attention_state=<tf.Tensor 'AttentionWrapperZeroState/zeros_3:0' shape=(64, ?) dtype=float

consterwinter commented 5 years ago

Have you solved this problem? I get the same error too.

Kaarthic29 commented 5 years ago

Hi,

Is there any working solution to this problem: AttributeError: module 'tensorflow.contrib.seq2seq' has no attribute 'DynamicAttentionWrapper'? I'm using TensorFlow version 1.14.0. I also tried a few workarounds mentioned in the comments above but still could not succeed. Any leads or help on this are greatly appreciated.

jianhui-ben commented 5 years ago

For AttributeError: module 'tensorflow.contrib.seq2seq' has no attribute 'DynamicAttentionWrapper', check out the answer above: change to: initial_state = dec_cell.zero_state(batch_size=batch_size,dtype=tf.float32).clone(cell_state=enc_state) also inference_logits, _, _ = tf.contrib.seq2seq.dynamic_decode(inference_decoder, output_time_major=False, impute_finished=True, maximum_iterations=max_target_length) and training_logits, _, _ = tf.contrib.seq2seq.dynamic_decode(training_decoder, output_time_major=False, impute_finished=True, maximum_iterations=max_target_length)

These changes, along with replacing DynamicAttentionWrapper with AttentionWrapper, work with the latest TensorFlow version (1.14).

For the other error, The two structures don't have the same nested structure: go back to the initial_state and change it to 'initial_state = dec_cell.zero_state(batch_size=batch_size,dtype=tf.float32).clone(cell_state=enc_state[0])' instead of 'initial_state = dec_cell.zero_state(batch_size=batch_size,dtype=tf.float32).clone(cell_state=enc_state)'
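
The structures differ (as the error above shows) because the bidirectional encoder returns a (forward, backward) pair of LSTMStateTuples, while the wrapped single-layer decoder cell expects just one; cloning with one direction's state makes them match. A minimal sketch of that change:

# enc_state is a (fw_state, bw_state) tuple from the bidirectional encoder,
# so pass only one direction's LSTMStateTuple when cloning the wrapper's zero state.
initial_state = dec_cell.zero_state(batch_size=batch_size,
                                    dtype=tf.float32).clone(cell_state=enc_state[0])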

sannge commented 4 years ago

@SumITUIC @HariWu1995 I made the changes as described, however I am getting an error. TensorFlow version: 1.10.0

TypeError: The two structures don't have the same nested structure.

First structure: type=AttentionWrapperState str=AttentionWrapperState(cell_state=LSTMStateTuple(c=<tf.Tensor 'AttentionWrapperZeroState/checked_cell_state:0' shape=(64, 256) dtype=float32>, h=<tf.Tensor 'AttentionWrapperZeroState/checked_cell_state_1:0' shape=(64, 256) dtype=float32>), attention=<tf.Tensor 'AttentionWrapperZeroState/zeros_2:0' shape=(64, 256) dtype=float32>, time=<tf.Tensor 'AttentionWrapperZeroState/zeros_1:0' shape=() dtype=int32>, alignments=<tf.Tensor 'AttentionWrapperZeroState/zeros:0' shape=(64, ?) dtype=float32>, alignment_history=(), attention_state=<tf.Tensor 'AttentionWrapperZeroState/zeros_3:0' shape=(64, ?) dtype=float32>)

Second structure: type=AttentionWrapperState str=AttentionWrapperState(cell_state=(LSTMStateTuple(c=<tf.Tensor 'encoder_1/bidirectional_rnn/fw/fw/while/Exit_3:0' shape=(?, 256) dtype=float32>, h=<tf.Tensor 'encoder_1/bidirectional_rnn/fw/fw/while/Exit_4:0' shape=(?, 256) dtype=float32>), LSTMStateTuple(c=<tf.Tensor 'encoder_1/bidirectional_rnn/bw/bw/while/Exit_3:0' shape=(?, 256) dtype=float32>, h=<tf.Tensor 'encoder_1/bidirectional_rnn/bw/bw/while/Exit_4:0' shape=(?, 256) dtype=float32>)), attention=<tf.Tensor 'AttentionWrapperZeroState/zeros_2:0' shape=(64, 256) dtype=float32>, time=<tf.Tensor 'AttentionWrapperZeroState/zeros_1:0' shape=() dtype=int32>, alignments=<tf.Tensor 'AttentionWrapperZeroState/zeros:0' shape=(64, ?) dtype=float32>, alignment_history=(), attention_state=<tf.Tensor 'AttentionWrapperZeroState/zeros_3:0' shape=(64, ?) dtype=float

Did you solve the problem by any chance? I have the same issue here.

splendidbug commented 4 years ago

For AttributeError: module 'tensorflow.contrib.seq2seq' has no attribute 'DynamicAttentionWrapper', check out the answer above: change to: initial_state = dec_cell.zero_state(batch_size=batch_size,dtype=tf.float32).clone(cell_state=enc_state) also inference_logits, _, _ = tf.contrib.seq2seq.dynamic_decode(inference_decoder, output_time_major=False, impute_finished=True, maximum_iterations=max_target_length) and training_logits, _, _ = tf.contrib.seq2seq.dynamic_decode(training_decoder, output_time_major=False, impute_finished=True, maximum_iterations=max_target_length)

These changes, along with replacing DynamicAttentionWrapper with AttentionWrapper, work with the latest TensorFlow version (1.14).

For the other error, The two structures don't have the same nested structure: go back to the initial_state and change it to 'initial_state = dec_cell.zero_state(batch_size=batch_size,dtype=tf.float32).clone(cell_state=enc_state[0])' instead of 'initial_state = dec_cell.zero_state(batch_size=batch_size,dtype=tf.float32).clone(cell_state=enc_state)'

This worked well, thanks. PS: It gave an error related to max_target_length, so just replace it with max_summary_length.

ayoubzeis commented 3 years ago

For AttributeError: module 'tensorflow.contrib.seq2seq' has no attribute 'DynamicAttentionWrapper', check out the answer above: change to: initial_state = dec_cell.zero_state(batch_size=batch_size,dtype=tf.float32).clone(cell_state=enc_state) also inference_logits, _, _ = tf.contrib.seq2seq.dynamic_decode(inference_decoder, output_time_major=False, impute_finished=True, maximum_iterations=max_target_length) and training_logits, _, _ = tf.contrib.seq2seq.dynamic_decode(training_decoder, output_time_major=False, impute_finished=True, maximum_iterations=max_target_length)

These changes, along with replacing DynamicAttentionWrapper with AttentionWrapper, work with the latest TensorFlow version (1.14).

For the other error, The two structures don't have the same nested structure: go back to the initial_state and change it to 'initial_state = dec_cell.zero_state(batch_size=batch_size,dtype=tf.float32).clone(cell_state=enc_state[0])' instead of 'initial_state = dec_cell.zero_state(batch_size=batch_size,dtype=tf.float32).clone(cell_state=enc_state)'

Thanks, it works for me. You saved me, thanks a lot!

spandanag333 commented 3 years ago

For AttributeError: module 'tensorflow.contrib.seq2seq' has no attribute 'DynamicAttentionWrapper', check out the answer above: change to: initial_state = dec_cell.zero_state(batch_size=batch_size,dtype=tf.float32).clone(cell_state=enc_state) also inference_logits, _, _ = tf.contrib.seq2seq.dynamic_decode(inference_decoder, output_time_major=False, impute_finished=True, maximum_iterations=max_target_length) and training_logits, _, _ = tf.contrib.seq2seq.dynamic_decode(training_decoder, output_time_major=False, impute_finished=True, maximum_iterations=max_target_length)

These changes, along with replacing DynamicAttentionWrapper with AttentionWrapper, work with the latest TensorFlow version (1.14).

For the other error, The two structures don't have the same nested structure: go back to the initial_state and change it to 'initial_state = dec_cell.zero_state(batch_size=batch_size,dtype=tf.float32).clone(cell_state=enc_state[0])' instead of 'initial_state = dec_cell.zero_state(batch_size=batch_size,dtype=tf.float32).clone(cell_state=enc_state)'

I have tried that and I am getting the error below. Please help me out. I am using TensorFlow 1.14.0. TypeError: decoding_layer() missing 1 required positional argument: 'direction'
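
That TypeError usually just means the call site inside seq2seq_model was not updated to pass every argument the replacement decoding_layer above expects, which now ends with direction. A sketch of the corrected call, with argument names following the function definition above:

training_logits, inference_logits = decoding_layer(dec_embed_input, embeddings,
                                                   enc_output, enc_state,
                                                   vocab_size, inputs_length,
                                                   targets_length, max_target_length,
                                                   rnn_size, vocab_to_int, keep_prob,
                                                   batch_size, num_layers, direction)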