kzjeef closed this issue 6 years ago
Hi,
k is the number of matrices and matrix multiplications in one cell. Normally k=3, as described in the paper. But when the input and output sizes don't match, the input is multiplied by an additional W in the highway connection to change the dimension.
See #12 and #16 for the details about k.
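The rule for k described above can be sketched in a few lines of plain Python. This is only an illustration, not the library's code: the helper name sru_weight_shape is hypothetical, and the idea that the k projections are stacked into a single (n_in, n_out * k) weight per direction is an assumption made for the example.

```python
def sru_weight_shape(n_in, n_out, bidirectional=False):
    """Return (k, weight_shape) for one cell (hypothetical helper)."""
    # With two directions the final output is a concatenation, so it is
    # twice as wide as the per-direction size n_out.
    out_size = n_out * 2 if bidirectional else n_out

    # Three matrices (W, Wf, Wr) as in the paper; a fourth one is needed
    # only when the input must be projected for the highway connection.
    k = 4 if n_in != out_size else 3

    # Assumption: all k projections are stacked column-wise into one
    # matrix, once per direction, so one matmul computes them all.
    num_directions = 2 if bidirectional else 1
    return k, (n_in, n_out * k * num_directions)


if __name__ == "__main__":
    print(sru_weight_shape(128, 128))        # sizes match, so k = 3
    print(sru_weight_shape(300, 128))        # sizes differ, so k = 4
    print(sru_weight_shape(256, 128, True))  # 256 == 2 * 128, so k = 3
```

This is also why the first layer is the usual place to see k = 4: its input is typically an embedding whose size differs from the hidden size, while later layers take the previous layer's output as input.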
Similar to nn.LSTM and nn.GRU, there will be two sub-RNN modules for two directions when bidirectional=True. The output of each sub-RNN is then concatenated into the final output, and hence the actual output dimension is n_out*2. With the highway connection, this becomes:
output_final = gate * concat(output_fwd, output_bwd) + (1 - gate) * x
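As a rough sketch of the formula above, the following plain-Python function combines the two directional outputs with the input for a single time step. The function name highway_output is hypothetical, vectors are ordinary lists, and treating the gate as element-wise is an assumption of this example.

```python
def highway_output(gate, output_fwd, output_bwd, x):
    """Combine both directions with the input through a highway gate."""
    # Concatenate the two directions: the result has length 2 * n_out.
    h = output_fwd + output_bwd

    # x must already have the same width as the concatenated output.
    # When it does not, it first has to be projected to that width,
    # which is the role of the additional matrix (the k = 4 case).
    if not (len(gate) == len(h) == len(x)):
        raise ValueError("gate, concatenated output and x must match in size")

    # Element-wise mix: gate close to 1 keeps the RNN output,
    # gate close to 0 passes the input through unchanged.
    return [g * hi + (1 - g) * xi for g, hi, xi in zip(gate, h, x)]


if __name__ == "__main__":
    out = highway_output(
        gate=[1.0, 0.0, 0.5, 0.5],
        output_fwd=[1.0, 2.0],
        output_bwd=[3.0, 4.0],
        x=[10.0, 20.0, 30.0, 40.0],
    )
    print(out)
```

The size check makes the dimension constraint explicit: the highway term (1 - gate) * x only works if x is as wide as the concatenated output.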
Thanks for your clear answer!
Hi @taolei87,
I have a question about the weight matrix dimensions. In the SRUCell code, I found:
k = 4 if n_in != out_size else 3
But the paper only has 3 weight matrices: W, Wf, and Wr. I also found that n_in will not equal out_size when the layer number is 0, but I don't understand why k = 4. What is the extra weight besides W, Wf, and Wr? Below is the init code:
Below is the case when n_in is not equal to out_size:
Thanks