Hi @digitnumber, if you look at the original paper, they added an extra bidirectional layer before the pointer network. Please check figure 1 of the paper. " After the original self-matching layer of the passage, we utilize bi-directional GRU to deeply integrate the matching results before feeding them into answer pointer layer. It helps to further propagate the information aggregated by self-matching of the passage." from section 4.2 Main Results.
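For reference, here is a minimal sketch of such a readout layer in TensorFlow 1.x, assuming `self_matched` is the `[batch, time, dim]` output of the self-matching layer and `passage_len` holds the passage lengths (both names are illustrative, not the repo's actual variables):

```python
import tensorflow as tf

# Extra bidirectional GRU over the self-matching output, before the pointer network.
cell_fw = tf.nn.rnn_cell.GRUCell(num_units=75)
cell_bw = tf.nn.rnn_cell.GRUCell(num_units=75)
(out_fw, out_bw), _ = tf.nn.bidirectional_dynamic_rnn(
    cell_fw, cell_bw, self_matched,
    sequence_length=passage_len, dtype=tf.float32)
readout = tf.concat([out_fw, out_bw], axis=-1)  # fed to the answer pointer layer
```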
Thanks for the quick reply!
Could you please also explain why you used `def call()` instead of `def __call__()` in `gated_attention_Wrapper`? I was getting this error:

`__call__ raise NotImplementedError("Abstract method")`

I think an RNN cell instance in TF defines a `call()` method, but I'm not sure, so I changed it to `__call__`. Thanks!
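For context, a minimal sketch of a custom cell in TensorFlow 1.x (the class name and wrapping logic are illustrative, not the actual `gated_attention_Wrapper`): in recent TF 1.x releases `RNNCell` is Layer-based, so the base `__call__` handles variable scoping and dispatches to `call()`; in older releases the abstract method was `__call__` itself, which is why overriding only `call()` can raise `NotImplementedError("Abstract method")`.

```python
import tensorflow as tf

class MyWrapperCell(tf.nn.rnn_cell.RNNCell):
    """Illustrative wrapper cell; not the repo's gated_attention_Wrapper."""

    def __init__(self, cell):
        super(MyWrapperCell, self).__init__()
        self._cell = cell

    @property
    def state_size(self):
        return self._cell.state_size

    @property
    def output_size(self):
        return self._cell.output_size

    # Newer TF 1.x: override call(); the base __call__ wraps it with scoping.
    # Older TF: the abstract method is __call__, so this would need renaming.
    def call(self, inputs, state):
        return self._cell(inputs, state)
```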
Why do you need `bidirectional_readout`? Because in the paper they use the outputs from `attention_match_rnn()` directly as the input to `pointer_network()`.