Closed yashkant closed 6 years ago
Since ENAS only designs the convolution cell and the reduction cell, there is no skip connection to prevent the vanishing gradient problem. To address this, we use an auxiliary loss, as in Google's Inception networks.
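As a rough illustration of the idea (a minimal sketch, not the actual code in this repository): the child model is trained on the sum of the final head's loss and a down-weighted auxiliary head's loss, so layers near the auxiliary head receive a more direct gradient signal. The function names and the `aux_weight=0.4` default are assumptions chosen for illustration only.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one example, computed from raw logits."""
    # Subtract the max logit before exponentiating for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def combined_loss(final_logits, aux_logits, target, aux_weight=0.4):
    """Final-head loss plus a down-weighted auxiliary-head loss.

    aux_weight is a hypothetical default, not a value taken from this repo.
    """
    final_loss = cross_entropy(final_logits, target)
    aux_loss = cross_entropy(aux_logits, target)
    return final_loss + aux_weight * aux_loss


# Example: both heads score the same 3-class example with true class 0.
loss = combined_loss([2.0, 0.5, -1.0], [1.0, 0.8, 0.1], target=0)
print(loss)
```

In a sketch like this the auxiliary head only matters during training; at evaluation time only the final head's output would be used.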
Thanks!
Hi,
Great work on simplifying the original code!
However, I see that you are using an auxiliary head in the child models and training them with the loss from both the final head's and the auxiliary head's outputs. Could you please explain why it is used?
Thanks.