If I understand correctly, equation 11 in the paper is computed here. For a batch of 5 with 3 features, `h'(w^t + b)` should have shape (5,) and `w` shape (3,), so `psi` should be (5, 3) and `psi_u` should be (5,). However, in the current implementation `psi` is (5,) and `psi_u` is a scalar. So the fix would be to replace the dot product with an element-wise product. Is that right, or did I make a mistake?
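
To make the shapes I expect concrete, here is a minimal NumPy sketch. It is not the repository's code: the names `z`, `w`, `u`, `b` and `h_prime` are hypothetical, and I am assuming `h = tanh` and that the argument of `h'` is `z @ w + b` computed per sample.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, features = 5, 3

# Hypothetical inputs and parameters (assumed names, not from the repo)
z = rng.normal(size=(batch, features))  # (5, 3)
w = rng.normal(size=(features,))        # (3,)
u = rng.normal(size=(features,))        # (3,)
b = 0.1

# Assumption: h = tanh, so h'(a) = 1 - tanh(a)^2, evaluated once per sample
h_prime = 1.0 - np.tanh(z @ w + b) ** 2  # (5,)

# Element-wise (broadcast) product: each sample's scalar h' scales the vector w
psi = h_prime[:, None] * w               # (5, 3)

# One dot product with u per sample
psi_u = psi @ u                          # (5,)

print(h_prime.shape, psi.shape, psi_u.shape)
```

With the broadcast product, `psi` keeps one row per sample, so `psi_u` ends up with one value per sample instead of collapsing to a scalar.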