D-X-Y / AutoDL-Projects

Automated deep learning algorithms implemented in PyTorch.
MIT License

Some doubts about the architecture sampling #25

Closed: liuzili97 closed this issue 4 years ago

liuzili97 commented 4 years ago

Hi, thanks for your great work. After reading your paper, I have some doubts.

  1. Are 1 or n architectures sampled when searching with n GPUs?
  2. The o_k sampled from Gumbel(0,1) has a known range of values, but A^k_{i,h} may vary a lot under different weight decay (wd) settings. So what's the best range for A?

Thanks.

D-X-Y commented 4 years ago

Thanks for your interest.

  1. n architectures are sampled for n GPUs.
  2. Good point, it could be a problem. We initialize A with Gaussian(0, 0.001) and do not constrain it during searching.
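To make the scale concern in point 2 concrete, here is a minimal pure-Python sketch of Gumbel-perturbed sampling over architecture parameters. It is not the repository's code: the helper names (`init_arch_params`, `gumbel_softmax`, `agreement`), the number of candidate operations (5), the temperature value, and the reading of 0.001 as a standard deviation are all assumptions made for illustration.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one Gumbel(0, 1) sample as -log(-log(u)) with u ~ Uniform(0, 1)."""
    # Clamp u away from 0 and 1 so both logarithms stay finite.
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def init_arch_params(num_ops, rng, std=1e-3):
    """Initialize A from a zero-mean Gaussian (assumption: 0.001 is the std)."""
    return [rng.gauss(0.0, std) for _ in range(num_ops)]


def gumbel_softmax(logits, tau, rng):
    """Softmax over (logit + Gumbel noise) / tau, computed stably."""
    perturbed = [(a + sample_gumbel(rng)) / tau for a in logits]
    peak = max(perturbed)
    exps = [math.exp(p - peak) for p in perturbed]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    return max(range(len(values)), key=values.__getitem__)


def agreement(logits, rng, trials=2000):
    """Fraction of noisy samples whose argmax equals the argmax of the logits."""
    best = argmax(logits)
    hits = 0
    for _ in range(trials):
        probs = gumbel_softmax(logits, tau=10.0, rng=rng)
        hits += argmax(probs) == best
    return hits / trials


rng = random.Random(0)

# Freshly initialized A: values are tiny compared with the Gumbel noise,
# so the sampled operation is close to uniformly random.
A = init_arch_params(num_ops=5, rng=rng)
probs = gumbel_softmax(A, tau=10.0, rng=rng)
small_scale = agreement(A, rng)

# Hypothetical unconstrained A that has grown large: the noise can no
# longer change the ranking, so sampling is effectively deterministic.
large_A = [100.0, 0.0, 0.0, 0.0, 0.0]
large_scale = agreement(large_A, rng)

print("agreement with tiny A:", small_scale)
print("agreement with large A:", large_scale)
```

The comparison illustrates why the range of A matters: the Gumbel noise has a fixed scale, so how much randomness the sampling keeps depends on how far apart the entries of A drift during search.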

liuzili97 commented 4 years ago

thanks!