quark0 / darts

Differentiable architecture search for convolutional and recurrent networks
https://arxiv.org/abs/1806.09055
Apache License 2.0

About alpha #163

Open andyzgj opened 3 years ago

andyzgj commented 3 years ago

Hi, I have a quick question. Why do all 8 cells share the same alpha?

def _initialize_alphas(self):
      # k = number of edges in a cell's DAG (each intermediate node i has 2+i incoming edges)
      k = sum(1 for i in range(self._steps) for n in range(2+i))
      num_ops = len(PRIMITIVES)

      # one (k x num_ops) matrix for all normal cells, one for all reduction cells
      self.alphas_normal = Variable(1e-3*torch.randn(k, num_ops).cuda(), requires_grad=True)
      self.alphas_reduce = Variable(1e-3*torch.randn(k, num_ops).cuda(), requires_grad=True)
      self._arch_parameters = [
        self.alphas_normal,
        self.alphas_reduce,
      ]

The code here looks like every cell shares the same alpha. Shouldn't each cell have an independent alpha?

sorobedio commented 2 years ago

According to the paper, all normal cells share the same architecture, so they all use the same alphas_normal; likewise, all reduction cells share alphas_reduce. The search only learns these two sets of architecture parameters, and the discovered cells are then stacked to build the final network.
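
To make the sharing concrete, here is a minimal sketch of how the forward pass consumes the alphas (a paraphrase of the search network's forward in model_search.py; exact names and surrounding details may differ):

import torch.nn.functional as F

def forward(self, input):
    s0 = s1 = self.stem(input)
    for i, cell in enumerate(self.cells):
        # every reduction cell reads the same alphas_reduce,
        # every normal cell reads the same alphas_normal
        if cell.reduction:
            weights = F.softmax(self.alphas_reduce, dim=-1)
        else:
            weights = F.softmax(self.alphas_normal, dim=-1)
        s0, s1 = s1, cell(s0, s1, weights)
    out = self.global_pooling(s1)
    logits = self.classifier(out.view(out.size(0), -1))
    return logits

So there is no per-cell alpha: the same softmaxed weight matrix is handed to every cell of the corresponding type, which is what makes the searched cell transferable to deeper stacked networks.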