changywtw / Dual-Attention-Network

Tensorflow implementation of Dual Attention Network

You don't get the cell state of every step #2

Open wangyazhao001 opened 5 years ago

wangyazhao001 commented 5 years ago
def compute_init_memory(self, images, questions, ques_lens, keep_prob):
    # ut
    state_fw = tf.contrib.rnn.LSTMCell(self.rnn_size, use_peepholes=False).zero_state(self.batch_size, dtype=tf.float32)
    state_bw = tf.contrib.rnn.LSTMCell(self.rnn_size, use_peepholes=False).zero_state(self.batch_size, dtype=tf.float32)

    fw = []
    bw = []
    for i in range(self.max_words_q):
        with tf.variable_scope('forward_lstm') as scope:
            if i > 0:
                scope.reuse_variables()
            ques_emb_linear_fw = tf.nn.embedding_lookup(self.embed_ques_W, questions[:, i])
            _, state_fw = self.lstm_layer(self.rnn_size, ques_emb_linear_fw, state_fw, 'forward_lstm')
            fw.append(tf.concat([state_fw.c, state_fw.h], 1))
            # fw.append(state_fw.h)

Hi! I don't think you are getting the cell state of every step. It looks like you only get a separate LSTM cell at each step, so the cell state cannot be carried over from one step to the next. Can you fix this?

changywtw commented 5 years ago

I'm not sure if this is what you're asking, but I assumed that I was getting the cell state of every step by reading "state_fw.c" inside the loop.

wangyazhao001 commented 5 years ago

Suppose you have the cell state from the previous time step. You should then pass that cell state, together with the input for the next time step, to the LSTM cell to get the cell state at the next time step. But you don't pass the cell state, so I think the cell state you get is wrong: it is not carried forward through time in the LSTMCell, and the cell state fed into the LSTMCell at each time step is 0. What do you think? I really hope we can solve this problem!

changywtw commented 5 years ago

_, state_fw = self.lstm_layer(self.rnn_size, ques_emb_linear_fw, state_fw, 'forward_lstm')

In the line above, I believe I feed "state_fw" as the state at time t and "ques_emb_linear_fw" as the input at time t+1 to produce the new state "state_fw" at time t+1. However, to be honest, I'm currently not sure whether this code works as I expect.
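
The state threading described above can be illustrated with a toy example. This is only a sketch in plain Python, not the repository's "lstm_layer": "toy_lstm_step" is a hypothetical scalar LSTM cell with made-up weights. It shows that rebinding the state to the value returned by each step carries the cell state forward, while feeding a zero state at every step does not.

```python
import math
from typing import List, Tuple

State = Tuple[float, float]  # (c, h): cell state and hidden state


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def toy_lstm_step(x: float, state: State) -> State:
    """One step of a hypothetical scalar LSTM with fixed, made-up weights."""
    c_prev, h_prev = state
    z = 0.5 * x + 0.5 * h_prev   # shared pre-activation (a simplification)
    i = sigmoid(z)               # input gate
    f = sigmoid(z + 1.0)         # forget gate with a bias of 1.0
    o = sigmoid(z)               # output gate
    g = math.tanh(z)             # candidate cell value
    c = f * c_prev + i * g       # new cell state depends on the previous one
    h = o * math.tanh(c)         # new hidden state
    return (c, h)


def run_threaded(inputs: List[float]) -> List[State]:
    """Use the state returned at step t as the input state of step t+1."""
    state: State = (0.0, 0.0)
    states = []
    for x in inputs:
        state = toy_lstm_step(x, state)  # rebinding carries the state forward
        states.append(state)
    return states


def run_reset(inputs: List[float]) -> List[State]:
    """Feed a zero state at every step, so nothing is carried forward."""
    return [toy_lstm_step(x, (0.0, 0.0)) for x in inputs]


inputs = [1.0, 1.0, 1.0]
threaded = run_threaded(inputs)
reset = run_reset(inputs)

# With identical inputs, the reset run repeats the same state at every step,
# while the threaded run keeps changing because it remembers earlier steps.
print("threaded:", threaded)
print("reset:   ", reset)
```

The loop in "run_threaded" has the same shape as the loop in the question: the name on the left of the assignment is the same name passed in on the right, so each iteration starts from the previous iteration's result.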

wangyazhao001 commented 5 years ago

That looks right. I will test it this weekend to see whether the cell state is being passed along. Thank you for your explanation. Of the three implementations I have read, your two-stage attention is the best. If I have more questions, I will ask you later. Thank you very much!
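
One possible way to run such a check, sketched under assumptions: if the state really is carried through the loop, changing only the first input should change the final state. The helper names and the two step functions below are hypothetical and written in plain Python; in the real model the step would be the TensorFlow LSTM call, and the comparison would need a numerical tolerance.

```python
from typing import Callable, List

# A step function maps (input, previous state) to the new state.
StepFn = Callable[[float, float], float]


def final_state(step: StepFn, inputs: List[float], init: float = 0.0) -> float:
    """Run the step function over all inputs, threading the state through."""
    state = init
    for x in inputs:
        state = step(x, state)
    return state


def final_state_depends_on_first_input(
    step: StepFn, inputs: List[float], delta: float = 1.0
) -> bool:
    """Perturb only the first input and report whether the final state moves.

    Needs at least two inputs to say anything about carried state.
    """
    perturbed = [inputs[0] + delta] + inputs[1:]
    return final_state(step, inputs) != final_state(step, perturbed)


def carrying_step(x: float, prev: float) -> float:
    """Hypothetical step that mixes the previous state into the new one."""
    return 0.5 * prev + x


def forgetting_step(x: float, prev: float) -> float:
    """Hypothetical step that ignores the previous state entirely."""
    return x


seq = [1.0, 2.0, 3.0]
carries = final_state_depends_on_first_input(carrying_step, seq)
forgets = final_state_depends_on_first_input(forgetting_step, seq)
print("carrying step sees history:", carries)
print("forgetting step sees history:", forgets)
```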