Closed: felizxia closed this issue 6 years ago
Hi @felizxia - this repository is no longer supported, as we've moved all of our development efforts to AllenNLP. But that line of code is actually correct, because of how `WeightedSum` works. The given attention tensor has three dimensions: `(batch_size, num_passage_words, num_question_words)`. `WeightedSum` uses the last dimension as weights for summing the `encoded_question`, and returns a tensor of shape `(batch_size, num_passage_words, encoding_dim)`. It's a sum over the encoded question, but it's done for each passage word, so the shapes end up right.
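A minimal NumPy sketch of that shape logic (illustrative only, not the actual deep_qa `WeightedSum` implementation; the sizes and names are made up):

```python
import numpy as np

batch_size, num_passage_words, num_question_words, encoding_dim = 2, 7, 5, 4

# (batch_size, num_question_words, encoding_dim)
encoded_question = np.random.rand(batch_size, num_question_words, encoding_dim)
# (batch_size, num_passage_words, num_question_words): one attention
# distribution over question words for every passage word.
attention = np.random.rand(batch_size, num_passage_words, num_question_words)
attention /= attention.sum(axis=-1, keepdims=True)  # normalize like a softmax

# Weighted sum over the last attention dimension (the question words):
# for each passage word, a weighted average of the encoded question vectors.
passage_question_vectors = np.einsum('bpq,bqe->bpe', attention, encoded_question)

print(passage_question_vectors.shape)  # (2, 7, 4) == (batch_size, num_passage_words, encoding_dim)
```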
Got it! Thank you very much for the clarification! I'm also wondering: has your team posted any solutions using the bidirectional attention method for cloze tests, like the CNN and Daily Mail test sets where the output is only an entity? I'm also very curious where the loss function is located, and whether I can change it to sum up each entity's p1 probabilities instead of using the p1 and p2 probabilities.
Thank you!
We have this model: https://github.com/allenai/deep_qa/blob/master/deep_qa/models/reading_comprehension/attention_sum_reader.py. But, again, this library is not supported; I don't remember how to run it anymore, or what data you need to use with it. Good luck!
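For the loss question, here is a rough, hypothetical sketch of the "sum each entity's probabilities" idea used by attention-sum readers (NumPy, invented variable names; this is not how deep_qa implements its loss, just the computation you would train against):

```python
import numpy as np

# Model output: one probability per passage token (already softmaxed), shape (passage_length,).
token_probs = np.array([0.05, 0.20, 0.05, 0.30, 0.10, 0.25, 0.05])

# Entity id for each passage token (e.g. @entity3 appears at positions 1 and 5).
token_entity_ids = np.array([0, 3, 1, 2, 1, 3, 0])

def entity_probability(entity_id):
    # Sum the probability mass over every occurrence of the entity.
    return token_probs[token_entity_ids == entity_id].sum()

print(entity_probability(3))  # 0.20 + 0.25 = 0.45

# The per-example training loss would then be the negative log of the
# correct entity's summed probability.
correct_entity = 3
loss = -np.log(entity_probability(correct_entity))
print(loss)
```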
Hi, first of all, thank you very much for your work!
While working through your code, I found that bidirectional_attention.py line 128, `passage_question_vectors = weighted_sum_layer([encoded_question, passage_question_attention])`, should instead be `weighted_sum_layer([encoded_passage, passage_question_attention])`. Both the C2Q and Q2C attention outputs should have shape 2d x T, where T is the maximum length of the passage rather than the question. Only then would the final merged attention vector have shape 8d x T, as in the original paper.
Thank you!
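For reference, a small NumPy sketch (made-up sizes, not the deep_qa code) showing why the merged passage representation still comes out as 8d per passage word, i.e. 8d x T, even when the weighted sum is taken over `encoded_question`:

```python
import numpy as np

T, J, two_d = 7, 5, 4  # passage length, question length, 2d (bi-LSTM output size)

h = np.random.rand(T, two_d)      # encoded passage, (T, 2d)
u = np.random.rand(J, two_d)      # encoded question, (J, 2d)
attention = np.random.rand(T, J)  # passage-to-question attention, (T, J)
attention /= attention.sum(-1, keepdims=True)

u_tilde = attention @ u           # C2Q: per-passage-word question summary, (T, 2d)
q2c_weights = np.random.rand(T)   # Q2C attention over passage words, (T,)
q2c_weights /= q2c_weights.sum()
h_tilde = np.tile(q2c_weights @ h, (T, 1))  # Q2C vector tiled to (T, 2d)

# Merged representation: [h; u_tilde; h * u_tilde; h * h_tilde] per passage word.
G = np.concatenate([h, u_tilde, h * u_tilde, h * h_tilde], axis=-1)
print(G.shape)  # (7, 16) == (T, 8d)
```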