pjlintw opened this issue 5 years ago
Hi @pjlintw
Section 3.2 clearly states that the attention memory in our proposed model has BOTH the Bi-LSTM encoding states AND the multi-perspective matching states. Please take a look at Equations 3 and 4.
> The decoder is identical to the one described in Section 2.2, **except** that matching information is added to the attention memory:
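In code terms, Equations 3 and 4 boil down to building the attention memory as a per-token concatenation of the two kinds of states. A minimal sketch (the tensor names and shapes here are illustrative, not taken from the repo):

```python
import tensorflow as tf

# Dummy tensors just to show the shapes; names and sizes are illustrative.
bilstm_states   = tf.ones([2, 7, 600])   # Bi-LSTM encoding states: [batch, passage_len, 2*hidden]
matching_states = tf.ones([2, 7, 100])   # multi-perspective matching states

# Attention memory = per-token concatenation of the two along the feature axis.
attention_memory = tf.concat([bilstm_states, matching_states], axis=2)  # shape [2, 7, 700]
```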
@freesunshine0316 Thank you for pointing that out!
My question is: why is the matching encoder output concatenated with `in_passage_repres` instead of `cur_in_passage_repres`? Here:

`encode_hiddens = tf.concat([aggregation_representation, in_passage_repres], 2)`
Equation (3) states that the passage states come from the bi-directional LSTM. So when I read the code, I expected the tensor to be `cur_in_passage_repres` (line 245), since that one is computed by the Bi-LSTM, whereas `in_passage_repres` is not.
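To spell out where the two tensors come from as I read the code (a simplified, runnable sketch; the real layers in `matching_encoder_utils.py` are of course more involved):

```python
import tensorflow as tf

# Toy stand-ins; the variable names mirror the repo, everything else is simplified.
in_passage_repres = tf.ones([2, 7, 300])  # filter-layer output (around line 181)

def toy_bi_lstm(x):
    # Stand-in for the repo's Bi-LSTM encoder: just doubles the feature dim
    # the way a forward/backward concatenation would.
    return tf.concat([x, x], axis=2)

cur_in_passage_repres = toy_bi_lstm(in_passage_repres)  # Bi-LSTM states (around line 245)

# So `cur_in_passage_repres` is the Bi-LSTM encoding that Equation (3) talks about,
# while `in_passage_repres` is still the pre-LSTM, filter-layer representation.
```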
It's been a long time... It looks like we should use `cur_in_passage_repres`, which may further improve the performance.
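If so, the fix would presumably be a one-line change around line 317 (untested, so treat it as a sketch rather than a verified patch):

```python
# matching_encoder_utils.py, around line 317: use the Bi-LSTM passage states
# instead of the filter-layer output when building the attention memory.
encode_hiddens = tf.concat([aggregation_representation, cur_in_passage_repres], 2)
```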
I read the paper "Leveraging Context Information for Natural Question Generation".
Section 2.2 says:
(Section 3.2 also claims that the proposed decoder takes the concatenation from the BiLSTM, the same as in Section 2.2.)
But in `matching_encoder_utils.py`, line 317 shows that the encoder state concatenates `aggregation_representation` and `in_passage_repres`, which is the output of the filter layer (line 181), not the output of the BiLSTM:

`encode_hiddens = tf.concat([aggregation_representation, in_passage_repres], 2)`
According to the paper, the encoder hidden states should be the concatenation of the aggregation representation with `cur_in_passage_repres` (line 245), right? Do I understand this correctly? I am trying to figure out the difference. Can anyone else help?
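To make the question concrete, here are the two alternatives side by side (a toy sketch; the shapes are made up and only the concatenation pattern matters):

```python
import tensorflow as tf

# Made-up shapes; only the concat pattern is the point of the question.
in_passage_repres          = tf.ones([2, 7, 300])  # filter-layer output (line 181)
cur_in_passage_repres      = tf.ones([2, 7, 600])  # Bi-LSTM output (line 245)
aggregation_representation = tf.ones([2, 7, 100])  # aggregated matching states

# What line 317 currently does:
encode_hiddens_code  = tf.concat([aggregation_representation, in_passage_repres], 2)

# What I read Sections 2.2 / 3.2 as describing:
encode_hiddens_paper = tf.concat([aggregation_representation, cur_in_passage_repres], 2)
```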