Hi @LagrangePan, I was wondering the same thing. I have filed a bug with the main TensorFlow repository:
Hi, I did some further research and found https://github.com/suriyadeepan/augmented_seq2seq/blob/01706d3869a42f3cf0bfe5c83f069646315a945e/bi_encoder.py
He uses tf.scan to manually generate a state tuple. However, there is no way (yet) to get encoder_final_outputs: https://github.com/suriyadeepan/augmented_seq2seq/issues/1
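For anyone following along, here is a minimal sketch of that tf.scan idea (the names and shapes below are mine, not from the linked repo): tf.scan threads an (output, state) accumulator through time, which is why the per-step state tuples become available.

```python
import tensorflow as tf

hidden_dim, batch_size = 64, 32
cell = tf.contrib.rnn.LSTMCell(hidden_dim)

# time-major inputs: [time, batch, features]
inputs_t = tf.placeholder(tf.float32, [None, batch_size, hidden_dim])

init_state = cell.zero_state(batch_size, tf.float32)
init_output = tf.zeros([batch_size, hidden_dim])

# fn(acc, x): acc is the previous (output, state) pair; cell(x, state)
# returns a new (output, state) pair with the same structure
outputs, states = tf.scan(lambda acc, x: cell(x, acc[1]),
                          inputs_t,
                          initializer=(init_output, init_state))
# outputs: [time, batch, hidden]; states.c and states.h are also stacked
# over time, unlike dynamic_rnn, which returns only the final state
```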
Just figured it out:
See the code below. However, stacked LSTMs may not work just yet.
```python
import tensorflow as tf

# assumes encoder_depth, hidden_dim, dropout, batch_size and enc_inp
# (batch-major: [batch, time, features]) are already defined

# build a stack of forward LSTM cells
enc_cells_fw = []
for i in range(encoder_depth):
    with tf.variable_scope('enc_RNN_{}'.format(i)):
        cell = tf.contrib.rnn.LSTMCell(hidden_dim)
        cell = tf.contrib.rnn.DropoutWrapper(cell, output_keep_prob=1.0 - dropout)
        enc_cells_fw.append(cell)
enc_cell_fw = tf.contrib.rnn.MultiRNNCell(enc_cells_fw, state_is_tuple=True)

# build a matching stack of backward LSTM cells
enc_cells_bw = []
for i in range(encoder_depth):
    with tf.variable_scope('enc_RNN_{}'.format(i)):
        cell = tf.contrib.rnn.LSTMCell(hidden_dim)
        cell = tf.contrib.rnn.DropoutWrapper(cell, output_keep_prob=1.0 - dropout)
        enc_cells_bw.append(cell)
enc_cell_bw = tf.contrib.rnn.MultiRNNCell(enc_cells_bw, state_is_tuple=True)

init_state = enc_cell_fw.zero_state(batch_size=batch_size, dtype=tf.float32)

# transpose encoder inputs to time-major
enc_inp_t = tf.transpose(enc_inp, [1, 0, 2])

# the bidirectional encoder; the scan accumulator is an (output, state)
# pair, so acc[1] is the state from the previous step
with tf.variable_scope('encoder-fw'):  # forward sequence
    enc_output_fw, enc_states_fw = tf.scan(
        lambda acc, x: enc_cell_fw(x, acc[1]),
        enc_inp_t,
        initializer=(tf.zeros(shape=[batch_size, hidden_dim]), init_state))
with tf.variable_scope('encoder-bw'):  # backward sequence
    enc_output_bw, enc_states_bw = tf.scan(
        lambda acc, x: enc_cell_bw(x, acc[1]),
        tf.reverse(enc_inp_t, axis=[0]),  # <- reverse inputs
        initializer=(tf.zeros(shape=[batch_size, hidden_dim]), init_state))

# back to batch-major, then concat forward and backward outputs
enc_output_fw = tf.transpose(enc_output_fw, [1, 0, 2])
enc_output_bw = tf.transpose(enc_output_bw, [1, 0, 2])
encoder_outputs = tf.concat([enc_output_fw, enc_output_bw], 2)

# projection weights for the context: one matrix each for c and h, per layer
Wc = tf.get_variable('Wc', shape=[2, encoder_depth, hidden_dim * 2, hidden_dim * 2],
                     initializer=tf.contrib.layers.xavier_initializer())

# extract context [get final state; project c,h; list->tuple]
encoder_final_state = []
for layer in range(encoder_depth):
    enc_c = tf.concat((enc_states_fw[layer].c[-1], enc_states_bw[layer].c[-1]), 1)
    enc_c = tf.matmul(enc_c, Wc[0][layer])
    enc_h = tf.concat((enc_states_fw[layer].h[-1], enc_states_bw[layer].h[-1]), 1)
    enc_h = tf.matmul(enc_h, Wc[1][layer])
    encoder_final_state.append(tf.contrib.rnn.LSTMStateTuple(c=enc_c, h=enc_h))

# convert list to tuple - eww!
encoder_final_state = tuple(encoder_final_state)
```
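A hedged note on using the result: since encoder_final_state is a tuple of LSTMStateTuples of size hidden_dim*2, it can seed a decoder MultiRNNCell of the same depth and size. A minimal sketch (dec_inp is a hypothetical batch-major input tensor, not a name from the code above):

```python
# hypothetical decoder wiring; cell sizes must match the projected state above
dec_cells = [tf.contrib.rnn.LSTMCell(hidden_dim * 2) for _ in range(encoder_depth)]
dec_cell = tf.contrib.rnn.MultiRNNCell(dec_cells, state_is_tuple=True)
dec_outputs, dec_state = tf.nn.dynamic_rnn(dec_cell, dec_inp,
                                           initial_state=encoder_final_state)
```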
Thank you for your help.
I am a bit confused about the loop_fn function:

```python
def loop_fn_transition(time, previous_output, previous_state, previous_loop_state):
    '''your code'''
    state = previous_state    # why does this just return previous_state?
    output = previous_output  # and this too???
    # print(output.shape)
    loop_state = None
    return (elements_finished,
            next_input,
            state,
            output,
            loop_state)
```
For all the steps, or only for the initial step?
all the steps
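To make that concrete, here is a hedged sketch of the usual raw_rnn pattern (initial_state, sequence_lengths, and get_next_input are placeholders, not names from this repo). raw_rnn calls loop_fn once at time 0 with previous_state=None to obtain the initial input and state, and then again at every subsequent step:

```python
def loop_fn(time, previous_output, previous_state, previous_loop_state):
    if previous_state is None:       # time == 0: the initialization call
        state = initial_state        # e.g. the encoder's final state
        output = None                # no cell output emitted yet
    else:                            # every later step
        state = previous_state       # pass the cell state straight through
        output = previous_output     # emit the cell output unchanged
    elements_finished = (time >= sequence_lengths)
    next_input = get_next_input(time, output)  # hypothetical helper
    return (elements_finished, next_input, state, output, None)
```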
Let me recheck it.
Actually, these two will give the same result, which is the state (here, output inside the function is not the actual output but the state). For the actual output calculation, you can see that after raw_rnn has run for all the decoder steps, attention is applied again; inside the RNN, attention is applied only to obtain the next input.
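A hedged sketch of what that post-raw_rnn step can look like (all names are hypothetical and the attention is plain dot-product, which may differ from this repo's version); it assumes the decoder state size matches the bi-encoder output size of hidden_dim*2:

```python
# stack the emitted "outputs" (really hidden states) from the TensorArray
dec_states = decoder_outputs_ta.stack()           # [dec_time, batch, 2*hidden]
dec_states = tf.transpose(dec_states, [1, 0, 2])  # batch-major

# dot-product attention over the bi-encoder outputs
scores = tf.matmul(dec_states, encoder_outputs, transpose_b=True)
weights = tf.nn.softmax(scores)                   # [batch, dec_time, enc_time]
context = tf.matmul(weights, encoder_outputs)     # [batch, dec_time, 2*hidden]

# combine context with the states to get the actual decoder outputs
decoder_outputs = tf.layers.dense(
    tf.concat([dec_states, context], axis=-1), hidden_dim, activation=tf.tanh)
```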
I want to know how to use multiple layers with your code.