freesunshine0316 / MPQG

Code corresponding to our paper "Leveraging Context Information for Natural Question Generation"

encoder state in `matching_encoder_utils.py` #9

Open pjlintw opened 5 years ago

pjlintw commented 5 years ago

I read the paper "Leveraging Context Information for Natural Question Generation".

Section 2.2 says:

Each encoder state h_j is the concatenation of two bi-directional LSTM states

(Section 3.2 also claims that the proposed decoder takes the concatenation from the BiLSTM, the same as in Section 2.2.)
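
In standard Bi-LSTM notation, that sentence amounts to the following (my reconstruction, not a verbatim equation from the paper):

```latex
h_j = [\overrightarrow{h}_j \,;\, \overleftarrow{h}_j]
```

where the two pieces are the forward and backward LSTM states at position j.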



But in `matching_encoder_utils.py`, line 317 shows that the encoder state concatenates `aggregation_representation` with `in_passage_repres`, which is the output of the filter layer (line 181), not the output of the BiLSTM:

```python
encode_hiddens = tf.concat([aggregation_representation, in_passage_repres], 2)
```
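
For concreteness, here is a minimal shape sketch of that line; the dimensions are made up, and only the variable names come from `matching_encoder_utils.py`:

```python
import tensorflow as tf  # TF 1.x-era code; also runs eagerly under TF 2

# Hypothetical shapes, [batch, passage_len, dim], with made-up dims.
aggregation_representation = tf.zeros([2, 5, 100])  # aggregation/matching states
in_passage_repres = tf.zeros([2, 5, 300])           # filter-layer output (line 181), not Bi-LSTM output

# Line 317: each encoder state is the per-token concatenation, dim 100 + 300.
encode_hiddens = tf.concat([aggregation_representation, in_passage_repres], 2)
print(encode_hiddens.shape)  # (2, 5, 400)
```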



According to the paper, the encoder hidden state should be the concatenation of the aggregation representation with `cur_in_passage_repres` (line 245), right?

Do I understand this correctly? I am trying to figure out the difference. Can anyone help?

freesunshine0316 commented 5 years ago

Hi @pjlintw

Section 3.2 clearly states that the attention memory in our proposed model has BOTH the Bi-LSTM encoding states AND the multi-perspective matching states. Please take a look at Equations 3 and 4.
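
To illustrate what that means in code, here is a toy sketch of an attention memory built from both sets of states. The names and dimensions are hypothetical, and the dot-product scoring at the end is a generic illustration, not necessarily the paper's exact attention formulation:

```python
import tensorflow as tf

# Hypothetical per-token states, shape [batch, passage_len, dim].
bilstm_states = tf.zeros([2, 5, 200])    # Bi-LSTM encoding states (Equation 3)
matching_states = tf.zeros([2, 5, 100])  # multi-perspective matching states (Equation 4)

# The attention memory holds BOTH, concatenated per token.
attention_memory = tf.concat([bilstm_states, matching_states], 2)  # [2, 5, 300]

# Toy dot-product attention for one decoder step (illustration only).
decoder_state = tf.zeros([2, 300])  # [batch, dim]
scores = tf.reduce_sum(attention_memory * tf.expand_dims(decoder_state, 1), axis=2)
attn_weights = tf.nn.softmax(scores, axis=1)  # [batch, passage_len]
context = tf.reduce_sum(tf.expand_dims(attn_weights, 2) * attention_memory, axis=1)
```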

freesunshine0316 commented 5 years ago

The decoder is identical to the one described in Section 2.2, **except** that matching information is added to the attention memory:
pjlintw commented 5 years ago

@freesunshine0316 Thank you for pointing that out!

My question is: why is the matching encoder output concatenated with `in_passage_repres` instead of `cur_in_passage_repres`?

Here: `encode_hiddens = tf.concat([aggregation_representation, in_passage_repres], 2)`

Equation (3) states that the passage states come from a bi-directional LSTM. When I read the code, I expected `cur_in_passage_repres` (line 245), since it is computed by the Bi-LSTM, whereas `in_passage_repres` is not.

freesunshine0316 commented 5 years ago

It's been a long time... It looks like we should use `cur_in_passage_repres`, which might further improve performance.
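
For reference, the suggested change would be a one-line swap at line 317. A sketch, untested, with stand-in tensors (the real ones are computed earlier in `matching_encoder_utils.py`):

```python
import tensorflow as tf

# Stand-ins with made-up shapes, [batch, passage_len, dim].
aggregation_representation = tf.zeros([2, 5, 100])  # matching states
in_passage_repres = tf.zeros([2, 5, 300])           # filter-layer output (line 181)
cur_in_passage_repres = tf.zeros([2, 5, 200])       # Bi-LSTM output (line 245)

# Current (line 317): concatenates the filter-layer output.
encode_hiddens = tf.concat([aggregation_representation, in_passage_repres], 2)

# Suggested swap (untested): concatenate the Bi-LSTM output instead,
# so the encoder states match Equation 3 of the paper.
encode_hiddens = tf.concat([aggregation_representation, cur_in_passage_repres], 2)
```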