jazzsaxmafia / show_attend_and_tell.tensorflow

BSD 2-Clause "Simplified" License
506 stars 191 forks

A question about parameters update #12

Open automan000 opened 7 years ago

automan000 commented 7 years ago

In the following code, it seems that the previous cell state `c` is never used when computing `h`:

```python
lstm_preactive = tf.matmul(h, self.lstm_U) + x_t + tf.matmul(weighted_context, self.image_encode_W)

i, f, o, new_c = tf.split(1, 4, lstm_preactive)

i = tf.nn.sigmoid(i)
f = tf.nn.sigmoid(f)
o = tf.nn.sigmoid(o)
new_c = tf.nn.tanh(new_c)
c = f * c + i * new_c
h = o * tf.nn.tanh(new_c)
```

Why does the parameter `h` depend on `new_c` rather than `c`? In my opinion, the update procedure should be:

c(t) = f(t) * c(t−1) + i(t) * new_c(t)
h(t) = o(t) * tanh(c(t))
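For reference, here is a minimal NumPy sketch of the standard LSTM step being described, where `h` is computed from the updated cell state `c` rather than from the candidate `new_c`. The function name `lstm_step` and the single pre-activation vector argument are illustrative assumptions, not the repo's actual API:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(c_prev, preactive):
    """One standard LSTM step.

    c_prev:    previous cell state c(t-1)
    preactive: concatenated pre-activations for the four gates [i, f, o, new_c]
    """
    i, f, o, new_c = np.split(preactive, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    new_c = np.tanh(new_c)            # candidate cell state ("new_c" in the repo's code)
    c = f * c_prev + i * new_c        # c(t) = f(t) * c(t-1) + i(t) * new_c(t)
    h = o * np.tanh(c)                # h(t) = o(t) * tanh(c(t)), NOT tanh(new_c)
    return h, c
```

With zero pre-activations and `c_prev = 1`, all gates are 0.5 and the candidate is 0, so `c = 0.5` and `h = 0.5 * tanh(0.5)`, which only matches if `h` is computed from `c`.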

davidsonic commented 7 years ago

Yeah, I also think that the update should be `h = o * tanh(c)` instead of `h = o * tanh(new_c)`.

shaoxuan92 commented 7 years ago

Hello, I don't quite understand the meaning of `x_t`. Could you give me some hints? Thank you!

Wind-Ward commented 6 years ago

Yes, the author of this package got it wrong; @automan000, you are right!

sjksong commented 5 years ago

I used the author's original model (i.e., without changing to h(t) = o(t) * tanh(c(t))). After 12 epochs, the loss only went down to 2.96379992. Is that right? The loss is so large that the generated output contains only a single word, which cannot form a sentence. @Wind-Ward, could I ask how many epochs you trained for before the results were satisfactory after fixing the mistake you pointed out? Or can a satisfactory model be trained without fixing it? I would appreciate any reply. I am a student from China and not good at English; sorry if I don't express myself well.