Closed: felizxia closed this issue 6 years ago
Hi @felizxia - this repository is no longer supported, as we've moved all of our development efforts to AllenNLP. But that line of code is actually correct, because of how `WeightedSum` works. The given attention tensor has three dimensions: `(batch_size, num_passage_words, num_question_words)`. `WeightedSum` uses the last dimension as weights for summing the `encoded_question`, and returns a tensor of shape `(batch_size, num_passage_words, encoding_dim)`. It's a sum over the encoded question, but it's done for each passage word, so the shapes end up right.
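A minimal NumPy sketch of that shape logic (illustrative only, not the actual deep_qa `WeightedSum` implementation; the sizes and names are made up):

```python
import numpy as np

batch_size, num_passage_words, num_question_words, encoding_dim = 2, 7, 5, 4

# (batch_size, num_question_words, encoding_dim)
encoded_question = np.random.rand(batch_size, num_question_words, encoding_dim)
# (batch_size, num_passage_words, num_question_words): one attention
# distribution over question words for every passage word.
attention = np.random.rand(batch_size, num_passage_words, num_question_words)
attention /= attention.sum(axis=-1, keepdims=True)  # normalize like a softmax

# Weighted sum over the last attention dimension (the question words):
# for each passage word, a weighted average of the encoded question vectors.
passage_question_vectors = np.einsum('bpq,bqe->bpe', attention, encoded_question)

print(passage_question_vectors.shape)  # (2, 7, 4) == (batch_size, num_passage_words, encoding_dim)
```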
Got it! Thank you very much for the clarification! I'm also wondering: has your team posted any solutions using the bidirectional attention method for cloze tests, like the CNN and Daily Mail test sets where the output is only an entity? I'm also very curious where the loss function is located, and whether I can change it to sum up each entity's p1 probabilities instead of using the p1 and p2 probabilities.
Thank you!
We have this model: https://github.com/allenai/deep_qa/blob/master/deep_qa/models/reading_comprehension/attention_sum_reader.py. But, again, this library is not supported; I don't remember how to run it anymore, or what data you need to use with it. Good luck!
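For the loss question, here is a rough, hypothetical sketch of the "sum each entity's probabilities" idea used by attention-sum readers (NumPy, invented variable names; this is not how deep_qa implements its loss, just the computation you would train against):

```python
import numpy as np

# Model output: one probability per passage token (already softmaxed), shape (passage_length,).
token_probs = np.array([0.05, 0.20, 0.05, 0.30, 0.10, 0.25, 0.05])

# Entity id for each passage token (e.g. @entity3 appears at positions 1 and 5).
token_entity_ids = np.array([0, 3, 1, 2, 1, 3, 0])

def entity_probability(entity_id):
    # Sum the probability mass over every occurrence of the entity.
    return token_probs[token_entity_ids == entity_id].sum()

print(entity_probability(3))  # 0.20 + 0.25 = 0.45

# The per-example training loss would then be the negative log of the
# correct entity's summed probability.
correct_entity = 3
loss = -np.log(entity_probability(correct_entity))
print(loss)
```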
Hi, first of all, thank you very much for your work!
While working through your code, I found that bidirectional_attention.py line 128, `passage_question_vectors = weighted_sum_layer([encoded_question, passage_question_attention])`, should instead be `weighted_sum_layer([encoded_passage, passage_question_attention])`. Both the C2Q and Q2C attention outputs should have shape 2d x T, where T is the maximum length of the passage rather than the question. Only then would the final merged attention vector have shape 8d x T, as in the original paper.
Thank you!
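For reference, a small NumPy sketch (made-up sizes, not the deep_qa code) showing why the merged passage representation still comes out as 8d per passage word, i.e. 8d x T, even when the weighted sum is taken over `encoded_question`:

```python
import numpy as np

T, J, two_d = 7, 5, 4  # passage length, question length, 2d (bi-LSTM output size)

h = np.random.rand(T, two_d)      # encoded passage, (T, 2d)
u = np.random.rand(J, two_d)      # encoded question, (J, 2d)
attention = np.random.rand(T, J)  # passage-to-question attention, (T, J)
attention /= attention.sum(-1, keepdims=True)

u_tilde = attention @ u           # C2Q: per-passage-word question summary, (T, 2d)
q2c_weights = np.random.rand(T)   # Q2C attention over passage words, (T,)
q2c_weights /= q2c_weights.sum()
h_tilde = np.tile(q2c_weights @ h, (T, 1))  # Q2C vector tiled to (T, 2d)

# Merged representation: [h; u_tilde; h * u_tilde; h * h_tilde] per passage word.
G = np.concatenate([h, u_tilde, h * u_tilde, h * h_tilde], axis=-1)
print(G.shape)  # (7, 16) == (T, 8d)
```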