quark0 / darts

Differentiable architecture search for convolutional and recurrent networks
https://arxiv.org/abs/1806.09055
Apache License 2.0

The discrepancy between DARTS and ENAS for RNN cell searching #156

Open cocovoc opened 3 years ago

cocovoc commented 3 years ago

Hi everyone, I got confused while reading the code. In rnn/model_search.py, line 28:

`ch = masked_states.view(-1, self.nhid).mm(self._Ws[i]).view(i+1, -1, 2*self.nhid)`

It seems that the hidden states of all predecessors share the same matrix: $H_3 = W H_0 + W H_1 + W H_2$. I think the right computation is $H_3 = W_{0,3} H_0 + W_{1,3} H_1 + W_{2,3} H_2$.

Does anyone know why the authors use the same matrix? Is it only to save memory?
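
To make the comparison concrete, here is a minimal NumPy sketch of the two variants, not the repository's code. The sizes (`nhid`, `batch`, `i`), the random data, and the names `W_shared` and `W_edge` are all made up for illustration. `W_shared` plays the role of `self._Ws[i]`, which I assume has shape `(nhid, 2*nhid)` given the `view` calls on that line.

```python
import numpy as np

rng = np.random.default_rng(0)
nhid, batch, i = 4, 2, 2  # hypothetical sizes; i + 1 predecessor states feed the next node

# Stacked predecessor states H_0..H_i, shape (i+1, batch, nhid)
states = rng.standard_normal((i + 1, batch, nhid))

# Shared variant (mirrors the quoted line): one matrix for all predecessors
W_shared = rng.standard_normal((nhid, 2 * nhid))
ch_shared = (states.reshape(-1, nhid) @ W_shared).reshape(i + 1, -1, 2 * nhid)

# Per-edge variant (the one proposed above): one matrix per predecessor
W_edge = rng.standard_normal((i + 1, nhid, 2 * nhid))
ch_edge = np.einsum("jbn,jnm->jbm", states, W_edge)

# Sanity check: tying every per-edge matrix to W_shared recovers the shared variant
tied = np.einsum("jbn,jnm->jbm", states, np.broadcast_to(W_shared, W_edge.shape))

n_shared = W_shared.size  # parameters for this node with sharing
n_edge = W_edge.size      # parameters for this node without sharing
print(ch_shared.shape, ch_edge.shape, n_shared, n_edge)
```

Both variants produce one projected row per predecessor with the same output shape, so the difference is only in the parameter count: the per-edge version needs `i + 1` times as many weights for this node.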