skhu101 / GM-NAS

Code for our ICLR'2022 paper "Generalizing Few-Shot NAS with Gradient Matching"

How do you train the sub-supernets before splitting the supernet by grads? #1

Closed marsggbo closed 2 years ago

marsggbo commented 2 years ago

Thanks for the great work.

I wonder how you train the sub-supernets before splitting the supernet by gradients.

Let's take NASBench201 as an example: say we have a sub-supernet with an encoding of

 tensor([[1., 0., 1., 1., 0.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 0., 1., 1., 0.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]], device='cuda:0')

Are all operations with value 1 involved in the forward and backward passes, or do you randomly sample only one operation per edge for each training batch?
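For concreteness, here is a minimal PyTorch sketch of the second option described above: uniformly sampling one operation per edge for each batch, restricted to the operations the encoding keeps. The `OP_NAMES` column order and the `sample_single_path` helper are illustrative assumptions, not code from this repository.

```python
import torch

# NAS-Bench-201 candidate operations; mapping them to the encoding's
# column order is an assumption for illustration.
OP_NAMES = ['none', 'skip_connect', 'nor_conv_1x1', 'nor_conv_3x3', 'avg_pool_3x3']

def sample_single_path(encoding: torch.Tensor) -> torch.Tensor:
    """Uniformly sample ONE active op (entry == 1) per edge (row).
    Returns a (num_edges,) LongTensor of op indices for one batch."""
    # torch.multinomial only needs nonnegative weights, so the 0/1
    # encoding itself acts as a uniform distribution over active ops.
    return torch.multinomial(encoding, num_samples=1).squeeze(1)

# The encoding from the question: 6 edges x 5 candidate ops.
encoding = torch.tensor([[1., 0., 1., 1., 0.],
                        [1., 1., 1., 1., 1.],
                        [1., 1., 1., 1., 1.],
                        [1., 0., 1., 1., 0.],
                        [1., 1., 1., 1., 1.],
                        [1., 1., 1., 1., 1.]])

path = sample_single_path(encoding)          # e.g. tensor([3, 0, 4, 2, 1, 1])
print([OP_NAMES[i] for i in path.tolist()])  # ops active for this batch
```

Under such a single-path scheme, only the sampled operations take part in the forward and backward passes for that batch; the other alternative the question mentions would be a weighted-sum forward over all operations the mask keeps.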

skhu101 commented 2 years ago

I hope your problem has been solved. If you have any further questions, feel free to start a discussion with us.