Thank you for providing the TensorFlow implementation.
One minor fix.
As per Equation (3) in the original paper, the three terms should be added:
\tilde{h}_j <- \phi(U h_j + V w_j + W s_t)
rather than multiplied:
\tilde{h}_j <- \phi(U h_j * V w_j * W s_t)
This can be further verified in their Torch code:
local candidate_memories = nn.PReLU(opt.edim)(nn.CAddTable(){U, V, W}):annotate{name = 'prelu'}
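
To make the difference concrete, here is a minimal NumPy sketch of the candidate-memory update. It is not the repository's TensorFlow code: the function names `prelu` and `candidate_memory`, the fixed PReLU slope of 0.25, and the random test inputs are all assumptions made for illustration (in the Torch code the PReLU slope is a learned parameter).

```python
import numpy as np

def prelu(x, alpha=0.25):
    # Parametric ReLU with a fixed slope (assumption: the real slope is learned).
    return np.where(x > 0, x, alpha * x)

def candidate_memory(h_j, w_j, s_t, U, V, W, alpha=0.25):
    # Equation (3): add the three projected terms, then apply the activation.
    return prelu(U @ h_j + V @ w_j + W @ s_t, alpha)

# Hypothetical inputs, only to compare the two variants.
rng = np.random.default_rng(0)
d = 4
U, V, W = (rng.standard_normal((d, d)) for _ in range(3))
h_j, w_j, s_t = (rng.standard_normal(d) for _ in range(3))

added = candidate_memory(h_j, w_j, s_t, U, V, W)
# The variant being corrected: element-wise product of the three terms.
multiplied = prelu((U @ h_j) * (V @ w_j) * (W @ s_t))
```

With addition, each of the three inputs contributes independently to the candidate; with the element-wise product, a near-zero entry in any one term suppresses the other two, which is not what Equation (3) describes.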