Closed susht3 closed 6 years ago
Hello,

In your code, both the large RNN and the small RNN appear to be computed at every time step:

```python
for t in range(x.size()[1]):
    embed_ = embed[:, t, :]
    h_state_l_, c_l_ = self.large_rnn(embed_, (h_state_l, c_l))
    h_state_s, c_s = self.small_rnn(embed_, (h_state_s, c_s))
```

As I understand it, though, only one of them should be computed per step: we take the action first and then decide which RNN to run, since the motivation is to reduce computational cost.
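
The "decide first, then compute only one" control flow described above can be sketched roughly as follows. This is only an illustration, not the repository's code or the paper's method: `run_sequence`, `decide`, and the toy scalar cells are hypothetical stand-ins for `self.large_rnn`, `self.small_rnn`, and whatever policy produces the action.

```python
def run_sequence(inputs, decide, large_cell, small_cell, state):
    """Run exactly one cell per time step, chosen before any cell is called."""
    calls = {"large": 0, "small": 0}
    for x_t in inputs:
        # Take the action first, using only the input and the previous state.
        use_large = decide(x_t, state)
        if use_large:
            state = large_cell(x_t, state)
            calls["large"] += 1
        else:
            state = small_cell(x_t, state)
            calls["small"] += 1
    return state, calls


# Toy stand-ins (hypothetical): scalar states and a simple threshold policy.
def large_cell(x_t, state):
    return 0.5 * state + x_t


def small_cell(x_t, state):
    return 0.9 * state + 0.1 * x_t


def decide(x_t, state):
    return abs(x_t) > 1.0


final_state, calls = run_sequence(
    [0.2, 3.0, -0.1, 2.0], decide, large_cell, small_cell, 0.0
)
print(calls)
```

With this structure the number of cell evaluations equals the sequence length, rather than twice that. Whether such a hard selection is valid during training, or only at inference time, depends on how the method defines its update, which is what the pointer to Eq. 7 in the reply concerns.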
You need to read the paper more carefully, especially Eq. 7.