Hello,

It's probable that I just misunderstood the code, but I think that in the `param_init_gru_cond` function (https://github.com/nyu-dl/dl4mt-tutorial/blob/master/session3/nmt.py#L390) the variable `dim` must equal `dim_nonlin`. The same is true for `nin` and `nin_nonlin`.

This is because the matrices `W` and `Wx` have dimensions `(nin, 2*dim)` and `(nin_nonlin, dim_nonlin)` respectively (https://github.com/nyu-dl/dl4mt-tutorial/blob/master/session3/nmt.py#L339). However, both `W` and `Wx` are multiplied with `state_below_` (https://github.com/nyu-dl/dl4mt-tutorial/blob/master/session3/nmt.py#L429), which would imply that `nin == nin_nonlin`.

Similarly, at (https://github.com/nyu-dl/dl4mt-tutorial/blob/master/session3/nmt.py#L445) `r1` (of size `dim`) is multiplied with `tensor.dot(h_, Ux)` (of size `dim_nonlin`), which would imply `dim == dim_nonlin`.

Is my understanding correct? If yes, is there a reason for having `dim_nonlin` and `nin_nonlin`?

Thank you.
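
For reference, the shape argument above can be sketched in NumPy. This is only an illustration, not the tutorial's Theano code: the helper name `gru_shapes_compatible`, the sample sizes, the random stand-in for `r1`, and the assumption that `Ux` is square with side `dim_nonlin` are mine.

```python
import numpy as np


def gru_shapes_compatible(nin, dim, nin_nonlin, dim_nonlin, n_samples=3):
    """Return True if the products described above are shape-compatible.

    Hypothetical helper: it reproduces only the shapes, not the actual
    GRU computation in nmt.py.
    """
    rng = np.random.default_rng(0)
    state_below = rng.standard_normal((n_samples, nin))  # layer input
    h_ = rng.standard_normal((n_samples, dim))           # previous hidden state
    r1 = rng.standard_normal((n_samples, dim))           # stand-in for the reset gate

    W = rng.standard_normal((nin, 2 * dim))              # (nin, 2*dim)
    Wx = rng.standard_normal((nin_nonlin, dim_nonlin))   # (nin_nonlin, dim_nonlin)
    Ux = rng.standard_normal((dim_nonlin, dim_nonlin))   # assumed square

    try:
        state_below @ W    # requires the input width to equal nin
        state_below @ Wx   # requires the input width to equal nin_nonlin
        (h_ @ Ux) * r1     # requires dim == dim_nonlin
    except ValueError:
        # NumPy raises ValueError on mismatched matmul or broadcast shapes.
        return False
    return True


print(gru_shapes_compatible(nin=4, dim=5, nin_nonlin=4, dim_nonlin=5))
print(gru_shapes_compatible(nin=4, dim=5, nin_nonlin=6, dim_nonlin=5))
print(gru_shapes_compatible(nin=4, dim=5, nin_nonlin=4, dim_nonlin=7))
```

In this sketch only the first call, where `nin == nin_nonlin` and `dim == dim_nonlin`, goes through; the other two fail on a shape mismatch.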